Results 1-16 (16)
 

1.  The UCSC Ebola Genome Portal 
PLoS Currents  2014;6:ecurrents.outbreaks.386ab0964ab4d6c8cb550bfb6071d822.
Background: With the Ebola epidemic raging out of control in West Africa, there has been a flurry of research into the Ebola virus, resulting in the generation of much genomic data. Methods: In response to the clear need for tools that integrate multiple strands of research around molecular sequences, we have created the University of California Santa Cruz (UCSC) Ebola Genome Browser, an adaptation of our popular UCSC Genome Browser web tool, which can be used to view the Ebola virus genome sequence from GenBank and nearly 30 annotation tracks generated by mapping external data to the reference sequence. Significant annotations include a multiple alignment comprising 102 Ebola genomes from the current outbreak, 56 from previous outbreaks, and 2 Marburg genomes as an outgroup; a gene track curated by NCBI; protein annotations curated by UniProt and antibody-binding epitopes curated by IEDB. We have extended the Genome Browser’s multiple alignment color-coding scheme to distinguish mutations resulting from non-synonymous coding changes, synonymous changes, or changes in untranslated regions. Discussion: Our Ebola Genome portal at http://genome.ucsc.edu/ebolaPortal/ links to the Ebola virus Genome Browser and an aggregate of useful information, including a collection of Ebola antibodies we are curating.
doi:10.1371/currents.outbreaks.386ab0964ab4d6c8cb550bfb6071d822
PMCID: PMC4318873
ebola; ebolavirus; EBOV; genome analysis; genomics
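The three mutation classes named in the abstract above (non-synonymous coding changes, synonymous changes, and changes in untranslated regions) can be illustrated with a minimal sketch. This is not the Genome Browser's own code: the function name `classify_substitution`, its parameters, and the returned labels are hypothetical. It assumes a single forward-strand coding region given as a half-open, 0-based interval, the standard genetic code, and that any position outside that interval counts as untranslated.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_substitution(ref_seq, pos, alt_base, cds_start, cds_end):
    """Classify a single-base substitution (hypothetical helper).

    ref_seq   -- reference DNA sequence, uppercase A/C/G/T
    pos       -- 0-based position of the substituted base
    alt_base  -- the base observed in the aligned genome
    cds_start, cds_end -- assumed coding region, half-open [start, end)
    """
    if ref_seq[pos] == alt_base:
        return "no_change"
    if not (cds_start <= pos < cds_end):
        # Simplification: everything outside the coding region is
        # treated as untranslated.
        return "untranslated"
    # Find the start of the codon that contains pos.
    codon_start = pos - (pos - cds_start) % 3
    offset = pos - codon_start
    ref_codon = ref_seq[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non_synonymous"


# Toy sequence: 2 flanking bases, the coding region ATG CTT TAA, 2 flanking bases.
toy = "GG" + "ATGCTTTAA" + "CC"
print(classify_substitution(toy, 7, "C", 2, 11))  # CTT -> CTC
print(classify_substitution(toy, 5, "G", 2, 11))  # CTT -> GTT
print(classify_substitution(toy, 0, "A", 2, 11))  # outside the coding region
```

A display layer could then map each returned label to a color. A real genome with several genes would need one coding interval per gene and strand handling, both of which this sketch omits.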
2.  The UCSC Genome Browser database: 2014 update 
Nucleic Acids Research  2013;42(Database issue):D764-D770.
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a large collection of organisms, primarily vertebrates, with an emphasis on the human and mouse genomes. The Browser’s web-based tools provide an integrated environment for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic data sets. As of September 2013, the database contained genomic sequence and a basic set of annotation ‘tracks’ for ∼90 organisms. Significant new annotations include a 60-species multiple alignment conservation track on the mouse, updated UCSC Genes tracks for human and mouse, and several new sets of variation and ENCODE data. New software tools include a Variant Annotation Integrator that returns predicted functional effects of a set of variants uploaded as a custom track, an extension to UCSC Genes that displays haplotype alleles for protein-coding genes and an expansion of data hubs that includes the capability to display remotely hosted user-provided assembly sequence in addition to annotation data. To improve European access, we have added a Genome Browser mirror (http://genome-euro.ucsc.edu) hosted at Bielefeld University in Germany.
doi:10.1093/nar/gkt1168
PMCID: PMC3964947  PMID: 24270787
3.  The UCSC Genome Browser database: extensions and updates 2013 
Nucleic Acids Research  2012;41(Database issue):D64-D69.
The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation ‘tracks’ are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.
doi:10.1093/nar/gks1048
PMCID: PMC3531082  PMID: 23155063
4.  The UCSC Genome Browser 
The University of California Santa Cruz (UCSC) Genome Browser is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation “tracks.” The annotations generated by the UCSC Genome Bioinformatics Group and external collaborators include gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload personal datasets in a wide variety of formats as custom annotation tracks in both browsers for research or educational purposes.
doi:10.1002/0471142905.hg1806s71
PMCID: PMC3222792  PMID: 21975940
Genome Browser; Table Browser; human genome; genome analysis; comparative genomics; human variation; next-gen sequencing; human genetics analysis; biological databases; BAM
5.  G-NEST: a gene neighborhood scoring tool to identify co-conserved, co-expressed genes 
BMC Bioinformatics  2012;13:253.
Background
In previous studies, gene neighborhoods—spatial clusters of co-expressed genes in the genome—have been defined using arbitrary rules such as requiring adjacency, a minimum number of genes, a fixed window size, or a minimum expression level. In the current study, we developed a Gene Neighborhood Scoring Tool (G-NEST) which combines genomic location, gene expression, and evolutionary sequence conservation data to score putative gene neighborhoods across all possible window sizes simultaneously.
Results
Using G-NEST on atlases of mouse and human tissue expression data, we found that large neighborhoods of ten or more genes are extremely rare in mammalian genomes. When they do occur, neighborhoods are typically composed of families of related genes. Both the highest scoring and the largest neighborhoods in mammalian genomes are formed by tandem gene duplication. Mammalian gene neighborhoods contain highly and variably expressed genes. Co-localized noisy gene pairs exhibit lower evolutionary conservation of their adjacent genome locations, suggesting that their shared transcriptional background may be disadvantageous. Genes that are essential to mammalian survival and reproduction are less likely to occur in neighborhoods, although neighborhoods are enriched with genes that function in mitosis. We also found that gene orientation and protein-protein interactions are partially responsible for maintenance of gene neighborhoods.
Conclusions
Our experiments using G-NEST confirm that tandem gene duplication is the primary driver of non-random gene order in mammalian genomes. Non-essentiality, co-functionality, gene orientation, and protein-protein interactions are additional forces that maintain gene neighborhoods, especially those formed by tandem duplicates. We expect G-NEST to be useful for other applications such as the identification of core regulatory modules, common transcriptional backgrounds, and chromatin domains. The software is available at http://docpollard.org/software.html
doi:10.1186/1471-2105-13-253
PMCID: PMC3575404  PMID: 23020263
Computational biology; Genomics; Gene expression; Gene duplication; Transcription; Cluster analysis; Gene neighborhood; Gene cluster; Bioinformatics; Evolution
6.  The UCSC Genome Browser database: extensions and updates 2011 
Nucleic Acids Research  2011;40(Database issue):D918-D923.
The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic data sets. In the past year, the local database has been updated with four new species assemblies, and we anticipate another four will be released by the end of 2011. Further, a large number of annotation tracks have been either added, updated by contributors, or remapped to the latest human reference genome. Among these are new phenotype and disease annotations, UCSC genes, and a major dbSNP update, which required new visualization methods. Growing beyond the local database, this year we have introduced ‘track data hubs’, which allow the Genome Browser to provide access to remotely located sets of annotations. This feature is designed to significantly extend the number and variety of annotation tracks that are publicly available for visualization and analysis from within our site. We have also introduced several usability features including track search and a context-sensitive menu of options available with a right-click anywhere on the Browser's image.
doi:10.1093/nar/gkr1055
PMCID: PMC3245018  PMID: 22086951
7.  ENCODE whole-genome data in the UCSC Genome Browser: update 2012 
Nucleic Acids Research  2011;40(Database issue):D912-D917.
The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets.
doi:10.1093/nar/gkr1012
PMCID: PMC3245183  PMID: 22075998
8.  Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project 
Gerstein, Mark B. | Lu, Zhi John | Van Nostrand, Eric L. | Cheng, Chao | Arshinoff, Bradley I. | Liu, Tao | Yip, Kevin Y. | Robilotto, Rebecca | Rechtsteiner, Andreas | Ikegami, Kohta | Alves, Pedro | Chateigner, Aurelien | Perry, Marc | Morris, Mitzi | Auerbach, Raymond K. | Feng, Xin | Leng, Jing | Vielle, Anne | Niu, Wei | Rhrissorrakrai, Kahn | Agarwal, Ashish | Alexander, Roger P. | Barber, Galt | Brdlik, Cathleen M. | Brennan, Jennifer | Brouillet, Jeremy Jean | Carr, Adrian | Cheung, Ming-Sin | Clawson, Hiram | Contrino, Sergio | Dannenberg, Luke O. | Dernburg, Abby F. | Desai, Arshad | Dick, Lindsay | Dosé, Andréa C. | Du, Jiang | Egelhofer, Thea | Ercan, Sevinc | Euskirchen, Ghia | Ewing, Brent | Feingold, Elise A. | Gassmann, Reto | Good, Peter J. | Green, Phil | Gullier, Francois | Gutwein, Michelle | Guyer, Mark S. | Habegger, Lukas | Han, Ting | Henikoff, Jorja G. | Henz, Stefan R. | Hinrichs, Angie | Holster, Heather | Hyman, Tony | Iniguez, A. Leo | Janette, Judith | Jensen, Morten | Kato, Masaomi | Kent, W. James | Kephart, Ellen | Khivansara, Vishal | Khurana, Ekta | Kim, John K. | Kolasinska-Zwierz, Paulina | Lai, Eric C. | Latorre, Isabel | Leahey, Amber | Lewis, Suzanna | Lloyd, Paul | Lochovsky, Lucas | Lowdon, Rebecca F. | Lubling, Yaniv | Lyne, Rachel | MacCoss, Michael | Mackowiak, Sebastian D. | Mangone, Marco | McKay, Sheldon | Mecenas, Desirea | Merrihew, Gennifer | Miller, David M. | Muroyama, Andrew | Murray, John I. | Ooi, Siew-Loon | Pham, Hoang | Phippen, Taryn | Preston, Elicia A. | Rajewsky, Nikolaus | Rätsch, Gunnar | Rosenbaum, Heidi | Rozowsky, Joel | Rutherford, Kim | Ruzanov, Peter | Sarov, Mihail | Sasidharan, Rajkumar | Sboner, Andrea | Scheid, Paul | Segal, Eran | Shin, Hyunjin | Shou, Chong | Slack, Frank J. | Slightam, Cindie | Smith, Richard | Spencer, William C. | Stinson, E. O. | Taing, Scott | Takasaki, Teruaki | Vafeados, Dionne | Voronina, Ksenia | Wang, Guilin | Washington, Nicole L. | Whittle, Christina M. 
| Wu, Beijing | Yan, Koon-Kiu | Zeller, Georg | Zha, Zheng | Zhong, Mei | Zhou, Xingliang | Ahringer, Julie | Strome, Susan | Gunsalus, Kristin C. | Micklem, Gos | Liu, X. Shirley | Reinke, Valerie | Kim, Stuart K. | Hillier, LaDeana W. | Henikoff, Steven | Piano, Fabio | Snyder, Michael | Stein, Lincoln | Lieb, Jason D. | Waterston, Robert H.
Science (New York, N.Y.)  2010;330(6012):1775-1787.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
doi:10.1126/science.1196914
PMCID: PMC3142569  PMID: 21177976
9.  The UCSC Genome Browser 
The University of California Santa Cruz (UCSC) Genome Browser (genome.ucsc.edu) is a popular Web-based tool for quickly displaying a requested portion of a genome at any scale, accompanied by a series of aligned annotation “tracks”. The annotations—generated by the UCSC Genome Bioinformatics Group and external collaborators—display gene predictions, mRNA and expressed sequence tag alignments, simple nucleotide polymorphisms, expression and regulatory data, phenotype and variation data, and pairwise and multiple-species comparative genomics data. All information relevant to a region is presented in one window, facilitating biological analysis and interpretation. The database tables underlying the Genome Browser tracks can be viewed, downloaded, and manipulated using another Web-based application, the UCSC Table Browser. Users can upload data as custom annotation tracks in both browsers for research or educational use. This unit describes how to use the Genome Browser and Table Browser for genome analysis, download the underlying database tables, and create and display custom annotation tracks.
doi:10.1002/0471250953.bi0104s28
PMCID: PMC2834533  PMID: 19957273
Genome Browser; Table Browser; UCSC; human genome; genome analysis; comparative genomics; human variation; Bioinformatics; Bioinformatics Fundamentals; Biological Databases
10.  ENCODE whole-genome data in the UCSC genome browser (2011 update) 
Nucleic Acids Research  2010;39(Database issue):D871-D875.
The ENCODE project is an international consortium with a goal of cataloguing all the functional elements in the human genome. The ENCODE Data Coordination Center (DCC) at the University of California, Santa Cruz serves as the central repository for ENCODE data. In this role, the DCC offers a collection of high-throughput, genome-wide data generated with technologies such as ChIP-Seq, RNA-Seq, DNA digestion and others. This data helps illuminate transcription factor-binding sites, histone marks, chromatin accessibility, DNA methylation, RNA expression, RNA binding and other cell-state indicators. It includes sequences with quality scores, alignments, signals calculated from the alignments, and in most cases, element or peak calls calculated from the signal data. Each data set is available for visualization and download via the UCSC Genome Browser (http://genome.ucsc.edu/). ENCODE data can also be retrieved using a metadata system that captures the experimental parameters of each assay. The ENCODE web portal at UCSC (http://encodeproject.org/) provides information about the ENCODE data and links for access.
doi:10.1093/nar/gkq1017
PMCID: PMC3013645  PMID: 21037257
11.  The UCSC Genome Browser database: update 2011 
Nucleic Acids Research  2010;39(Database issue):D876-D882.
The University of California, Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online access to a database of genomic sequence and annotation data for a wide variety of organisms. The Browser also has many tools for visualizing, comparing and analyzing both publicly available and user-generated genomic data sets, aligning sequences and uploading user data. Among the features released this year are a gene search tool and annotation track drag-reorder functionality as well as support for BAM and BigWig/BigBed file formats. New display enhancements include overlay of multiple wiggle tracks through use of transparent coloring, options for displaying transformed wiggle data, a ‘mean+whiskers’ windowing function for display of wiggle data at high zoom levels, and more color schemes for microarray data. New data highlights include seven new genome assemblies, a Neandertal genome data portal, phenotype and disease association data, a human RNA editing track, and a zebrafish Conservation track. We also describe updates to existing tracks.
doi:10.1093/nar/gkq963
PMCID: PMC3242726  PMID: 20959295
12.  ENCODE whole-genome data in the UCSC Genome Browser 
Nucleic Acids Research  2009;38(Database issue):D620-D625.
The Encyclopedia of DNA Elements (ENCODE) project is an international consortium of investigators funded to analyze the human genome with the goal of producing a comprehensive catalog of functional elements. The ENCODE Data Coordination Center at The University of California, Santa Cruz (UCSC) is the primary repository for experimental results generated by ENCODE investigators. These results are captured in the UCSC Genome Bioinformatics database and download server for visualization and data mining via the UCSC Genome Browser and companion tools (Rhead et al. The UCSC Genome Browser Database: update 2010, in this issue). The ENCODE web portal at UCSC (http://encodeproject.org or http://genome.ucsc.edu/ENCODE) provides information about the ENCODE data and convenient links for access.
doi:10.1093/nar/gkp961
PMCID: PMC2808953  PMID: 19920125
13.  The UCSC Genome Browser database: update 2010 
Nucleic Acids Research  2009;38(Database issue):D613-D619.
The University of California, Santa Cruz (UCSC) Genome Browser website (http://genome.ucsc.edu/) provides a large database of publicly available sequence and annotation data along with an integrated tool set for examining and comparing the genomes of organisms, aligning sequence to genomes, and displaying and sharing users’ own annotation data. As of September 2009, genomic sequence and a basic set of annotation ‘tracks’ are provided for 47 organisms, including 14 mammals, 10 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms and a yeast. New data highlights this year include an updated human genome browser, a 44-species multiple sequence alignment track, improved variation and phenotype tracks and 16 new genome-wide ENCODE tracks. New features include drag-and-zoom navigation, a Wiki track for user-added annotations, new custom track formats for large datasets (bigBed and bigWig), a new multiple alignment output tool, links to variation and protein structure tools, in silico PCR utility enhancements, and improved track configuration tools.
doi:10.1093/nar/gkp939
PMCID: PMC2808870  PMID: 19906737
14.  The bovine lactation genome: insights into the evolution of mammalian milk 
Genome Biology  2009;10(4):R43.
Comparison of milk protein and mammary genes in the bovine genome with those from other mammals gives insights into the evolution of lactation.
Background
The newly assembled Bos taurus genome sequence enables the linkage of bovine milk and lactation data with other mammalian genomes.
Results
Using publicly available milk proteome data and mammary expressed sequence tags, 197 milk protein genes and over 6,000 mammary genes were identified in the bovine genome. Intersection of these genes with 238 milk production quantitative trait loci curated from the literature decreased the search space for milk trait effectors by more than an order of magnitude. Genome location analysis revealed a tendency for milk protein genes to be clustered with other mammary genes. Using the genomes of a monotreme (platypus), a marsupial (opossum), and five placental mammals (bovine, human, dog, mice, rat), gene loss and duplication, phylogeny, sequence conservation, and evolution were examined. Compared with other genes in the bovine genome, milk and mammary genes are: more likely to be present in all mammals; more likely to be duplicated in therians; more highly conserved across Mammalia; and evolving more slowly along the bovine lineage. The most divergent proteins in milk were associated with nutritional and immunological components of milk, whereas highly conserved proteins were associated with secretory processes.
Conclusions
Although both copy number and sequence variation contribute to the diversity of milk protein composition across species, our results suggest that this diversity is primarily due to other mechanisms. Our findings support the essentiality of milk to the survival of mammalian neonates and the establishment of milk secretory mechanisms more than 160 million years ago.
doi:10.1186/gb-2009-10-4-r43
PMCID: PMC2688934  PMID: 19393040
15.  Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures 
Nature  2007;450(7167):219-232.
Sequencing of multiple related species followed by comparative genomics analysis constitutes a powerful approach for the systematic understanding of any genome. Here, we use the genomes of 12 Drosophila species for the de novo discovery of functional elements in the fly. Each type of functional element shows characteristic patterns of change, or ‘evolutionary signatures’, dictated by its precise selective constraints. Such signatures enable recognition of new protein-coding genes and exons, spurious and incorrect gene annotations, and numerous unusual gene structures, including abundant stop-codon readthrough. Similarly, we predict non-protein-coding RNA genes and structures, and new microRNA (miRNA) genes. We provide evidence of miRNA processing and functionality from both hairpin arms and both DNA strands. We identify several classes of pre- and post-transcriptional regulatory motifs, and predict individual motif instances with high confidence. We also study how discovery power scales with the divergence and number of species compared, and we provide general guidelines for comparative studies.
doi:10.1038/nature06340
PMCID: PMC2474711  PMID: 17994088
16.  The ENCODE Project at UC Santa Cruz 
Nucleic Acids Research  2006;35(Database issue):D663-D667.
The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.
doi:10.1093/nar/gkl1017
PMCID: PMC1781110  PMID: 17166863
