Results 1-23 (23)

1.  Integration of new alternative reference strain genome sequences into the Saccharomyces genome database 
The Saccharomyces Genome Database (SGD) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains is now available on the Sequence and Protein tabs of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences.
Database URL:
PMCID: PMC4888754  PMID: 27252399
2.  From one to many: expanding the Saccharomyces cerevisiae reference genome panel 
In recent years, thousands of Saccharomyces cerevisiae genomes have been sequenced to varying degrees of completion. The Saccharomyces Genome Database (SGD) has long been the keeper of the original eukaryotic reference genome sequence, which was derived primarily from S. cerevisiae strain S288C. Because new technologies are pushing S. cerevisiae annotation past the limits of any system based exclusively on a single reference sequence, SGD is actively working to expand the original S. cerevisiae systematic reference sequence from a single genome to a multi-genome reference panel. We first commissioned the sequencing of additional genomes and their automated analysis using the AGAPE pipeline. Here we describe our curation strategy to produce manually reviewed high-quality genome annotations in order to elevate 11 of these additional genomes to Reference status.
Database URL:
PMCID: PMC4795930  PMID: 26989152
3.  The Saccharomyces Genome Database Variant Viewer 
Nucleic Acids Research  2015;44(Database issue):D698-D702.
The Saccharomyces Genome Database (SGD) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at
PMCID: PMC4702884  PMID: 26578556
4.  The Reference Genome Sequence of Saccharomyces cerevisiae: Then and Now 
G3: Genes|Genomes|Genetics  2013;4(3):389-398.
The genome of the budding yeast Saccharomyces cerevisiae was the first completely sequenced from a eukaryote. It was released in 1996 as the work of a worldwide effort of hundreds of researchers. In the time since, the yeast genome has been intensively studied by geneticists, molecular biologists, and computational scientists all over the world. Maintenance and annotation of the genome sequence have long been provided by the Saccharomyces Genome Database, one of the original model organism databases. To deepen our understanding of the eukaryotic genome, the S. cerevisiae strain S288C reference genome sequence was updated recently in its first major update since 1996. The new version, called “S288C 2010,” was determined from a single yeast colony using modern sequencing technologies and serves as the anchor for further innovations in yeast genomic science.
PMCID: PMC3962479  PMID: 24374639
Saccharomyces cerevisiae; model organism; reference sequence; genome release; S288C
5.  Saccharomyces genome database provides new regulation data 
Nucleic Acids Research  2013;42(Database issue):D717-D725.
The Saccharomyces Genome Database (SGD) is the community resource for genomic, gene and protein information about the budding yeast Saccharomyces cerevisiae, containing a variety of functional information about each yeast gene and gene product. We have recently added regulatory information to SGD and present it on a new tabbed section of the Locus Summary entitled ‘Regulation’. We are compiling transcriptional regulator–target gene relationships, which are curated from the literature at SGD or imported, with permission, from the YEASTRACT database. For nearly every S. cerevisiae gene, the Regulation page displays a table of annotations showing the regulators of that gene, and a graphical visualization of its regulatory network. For genes whose products act as transcription factors, the Regulation page also shows a table of their target genes, accompanied by a Gene Ontology enrichment analysis of the biological processes in which those genes participate. We additionally synthesize information from the literature for each transcription factor in a free-text Regulation Summary, and provide other information relevant to its regulatory function, such as DNA binding site motifs and protein domains. All of the regulation data are available for querying, analysis and download via YeastMine, the InterMine-based data warehouse system in use at SGD.
PMCID: PMC3965049  PMID: 24265222
6.  Saccharomyces Genome Database: the genomics resource of budding yeast 
Nucleic Acids Research  2011;40(Database issue):D700-D705.
The Saccharomyces Genome Database (SGD) is the community resource for the budding yeast Saccharomyces cerevisiae. The SGD project provides the highest-quality manually curated information from peer-reviewed literature. The experimental results reported in the literature are extracted and integrated within a well-developed database. These data are combined with quality high-throughput results and provided through Locus Summary pages, a powerful query engine and rich genome browser. The acquisition, integration and retrieval of these data allow SGD to facilitate experimental design and analysis by providing an encyclopedia of the yeast genome, its chromosomal features, their functions and interactions. Public access to these data is provided to researchers and educators via web pages designed for optimal ease of use.
PMCID: PMC3245034  PMID: 22110037
7.  Comparison of the Complete Protein Sets of Worm and Yeast: Orthology and Divergence 
Science (New York, N.Y.)  1998;282(5396):2022-2028.
Comparative analysis of predicted protein sequences encoded by the genomes of Caenorhabditis elegans and Saccharomyces cerevisiae suggests that most of the core biological functions are carried out by orthologous proteins (proteins of different species that can be traced back to a common ancestor) that occur in comparable numbers. The specialized processes of signal transduction and regulatory control that are unique to the multicellular worm appear to use novel proteins, many of which re-use conserved domains. Major expansion of the number of some of these domains seen in the worm may have contributed to the advent of multicellularity. The proteins conserved in yeast and worm are likely to have orthologs throughout eukaryotes; in contrast, the proteins unique to the worm may well define metazoans.
PMCID: PMC3057080  PMID: 9851918
8.  Genetic and physical maps of Saccharomyces cerevisiae 
Nature  1997;387(6632 Suppl):67-73.
Genetic and physical maps for the 16 chromosomes of Saccharomyces cerevisiae are presented. The genetic map is the result of 40 years of genetic analysis. The physical map was produced from the results of an international systematic sequencing effort. The data for the maps are accessible electronically from the Saccharomyces Genome Database (SGD).
PMCID: PMC3057085  PMID: 9169866
9.  GO::TermFinder—open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes 
Bioinformatics (Oxford, England)  2004;20(18):3710-3715.
GO::TermFinder comprises a set of object-oriented Perl modules for accessing Gene Ontology (GO) information and evaluating and visualizing the collective annotation of a list of genes to GO terms. It can be used to draw conclusions from microarray and other biological data, calculating the statistical significance of each annotation. GO::TermFinder can be used on any system on which Perl can be run, either as a command line application, in single or batch mode, or as a web-based CGI script.
The full source code and documentation for GO::TermFinder are freely available from
PMCID: PMC3037731  PMID: 15297299
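The abstract of entry 9 says that GO::TermFinder calculates the statistical significance of each annotation but does not name the test. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for term enrichment, not confirmed by the abstract), the p-value for one GO term could be computed as below. The function name and parameters are hypothetical and are not part of the GO::TermFinder Perl API.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) under a hypergeometric model.

    population: total number of genes in the background set
    annotated:  number of background genes annotated to the GO term
    sample:     number of genes in the list of interest
    hits:       number of genes in the list annotated to the GO term

    Hypothetical helper for illustration; assumes sampling without
    replacement, which the abstract does not state.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for hits, hits + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Made-up numbers: 6000 background genes, 40 carry the term,
    # and 5 of the 100 genes in the list carry it.
    print(hypergeom_enrichment_p(6000, 40, 100, 5))
```

When many GO terms are tested against the same gene list, the raw p-values would normally need a multiple-testing correction; that step is left out of this sketch.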
10.  Saccharomyces genome database: Underlying principles and organisation 
Briefings in bioinformatics  2004;5(1):9-22.
A scientific database can be a powerful tool for biologists in an era where large-scale genomic analysis, combined with smaller-scale scientific results, provides new insights into the roles of genes and their products in the cell. However, the collection and assimilation of data is, in itself, not enough to make a database useful. The data must be incorporated into the database and presented to the user in an intuitive and biologically significant manner. Most importantly, this presentation must be driven by the user’s point of view; that is, from a biological perspective. The success of a scientific database can therefore be measured by the response of its users – statistically, by usage numbers and, in a less quantifiable way, by its relationship with the community it serves and its ability to serve as a model for similar projects. Since its inception ten years ago, the Saccharomyces Genome Database (SGD) has seen a dramatic increase in its usage, has developed and maintained a positive working relationship with the yeast research community, and has served as a template for at least one other database. The success of SGD, as measured by these criteria, is due in large part to philosophies that have guided its mission and organisation since it was established in 1993. This paper aims to detail these philosophies and how they shape the organisation and presentation of the database.
PMCID: PMC3037832  PMID: 15153302
S. cerevisiae; database; genome-wide analysis; bioinformatics; yeast
11.  Saccharomyces Genome Database provides mutant phenotype data 
Nucleic Acids Research  2009;38(Database issue):D433-D436.
The Saccharomyces Genome Database (SGD) is a scientific database for the molecular biology and genetics of the yeast Saccharomyces cerevisiae, which is commonly known as baker’s or budding yeast. The information in SGD includes functional annotations, mapping and sequence information, protein domains and structure, expression data, mutant phenotypes, physical and genetic interactions and the primary literature from which these data are derived. Here we describe how published phenotypes and genetic interaction data are annotated and displayed in SGD.
PMCID: PMC2808950  PMID: 19906697
12.  Gene Ontology annotations at SGD: new data sources and annotation methods 
Nucleic Acids Research  2007;36(Database issue):D577-D581.
The Saccharomyces Genome Database (SGD) collects and organizes biological information about the chromosomal features and gene products of the budding yeast Saccharomyces cerevisiae. Although published data from traditional experimental methods are the primary sources of evidence supporting Gene Ontology (GO) annotations for a gene product, high-throughput experiments and computational predictions can also provide valuable insights in the absence of an extensive body of literature. Therefore, GO annotations available at SGD now include high-throughput data as well as computational predictions provided by the GO Annotation Project (GOA UniProt). Because the annotation method used to assign GO annotations varies by data source, GO resources at SGD have been modified to distinguish data sources and annotation methods. In addition to providing information for genes that have not been experimentally characterized, GO annotations from independent sources can be compared to those made by SGD to help keep the literature-based GO annotations current.
PMCID: PMC2238894  PMID: 17982175
13.  Expanded protein information at SGD: new pages and proteome browser 
Nucleic Acids Research  2006;35(Database issue):D468-D471.
The recent explosion in protein data generated from both directed small-scale studies and large-scale proteomics efforts has greatly expanded the quantity of available protein information and has prompted the Saccharomyces Genome Database (SGD) to enhance the depth and accessibility of protein annotations. In particular, we have expanded ongoing efforts to improve the integration of experimental information and sequence-based predictions and have redesigned the protein information web pages. A key feature of this redesign is the development of a GBrowse-derived interactive Proteome Browser customized to improve the visualization of sequence-based protein information. This Proteome Browser has enabled SGD to unify the display of hidden Markov model (HMM) domains, protein family HMMs, motifs, transmembrane regions, signal peptides, hydropathy plots and profile hits using several popular prediction algorithms. In addition, a physico-chemical properties page has been introduced to provide easy access to basic protein information. Improvements to the layout of the Protein Information page and integration of the Proteome Browser will facilitate the ongoing expansion of sequence-specific experimental information captured in SGD, including post-translational modifications and other user-defined annotations. Finally, SGD continues to improve upon the availability of genetic and physical interaction data in an ongoing collaboration with BioGRID by providing direct access to more than 82 000 manually-curated interactions.
PMCID: PMC1669759  PMID: 17142221
14.  Genome Snapshot: a new resource at the Saccharomyces Genome Database (SGD) presenting an overview of the Saccharomyces cerevisiae genome 
Nucleic Acids Research  2005;34(Database issue):D442-D445.
Sequencing and annotation of the entire Saccharomyces cerevisiae genome has made it possible to gain a genome-wide perspective on yeast genes and gene products. To make this information available on an ongoing basis, the Saccharomyces Genome Database (SGD) has created the Genome Snapshot. The Genome Snapshot summarizes the current state of knowledge about the genes and chromosomal features of S. cerevisiae. The information is organized into two categories: (i) number of each type of chromosomal feature annotated in the genome and (ii) number and distribution of genes annotated to Gene Ontology terms. Detailed lists are accessible through SGD's Advanced Search tool, and all the data presented on this page are available from the SGD ftp site.
PMCID: PMC1347479  PMID: 16381907
15.  Tetrahymena Genome Database (TGD): a new genomic resource for Tetrahymena thermophila research 
Nucleic Acids Research  2005;34(Database issue):D500-D503.
We have developed a web-based resource for researchers studying the model ciliate organism Tetrahymena thermophila. Employing the underlying database structure and programming of the Saccharomyces Genome Database, the Tetrahymena Genome Database (TGD) integrates the wealth of knowledge generated by the Tetrahymena research community about genome structure, genes and gene products with the newly sequenced macronuclear genome determined by The Institute for Genomic Research (TIGR). TGD provides information curated from the literature about each published gene, including a standardized gene name, a link to the genomic locus in our graphical genome browser, gene product annotations utilizing the Gene Ontology, links to published literature about the gene and more. TGD also displays automatic annotations generated for the gene models predicted by TIGR. A variety of tools are available at TGD for searching the Tetrahymena genome, its literature and information about members of the research community.
PMCID: PMC1347417  PMID: 16381920
16.  PatMatch: a program for finding patterns in peptide and nucleotide sequences 
Nucleic Acids Research  2005;33(Web Server issue):W262-W266.
Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497–498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265–1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download at The Arabidopsis Information Resource (TAIR). The PatMatch server is available on the web for searching Arabidopsis thaliana sequences.
PMCID: PMC1160129  PMID: 15980466
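The abstract of entry 16 describes patterns written with ambiguous sequence codes and a syntax based on regular expressions. As a rough illustration of that idea only, the sketch below expands the standard IUPAC nucleotide ambiguity codes into a Python regular expression and reports overlapping matches. It is not PatMatch's pattern syntax, it does not use NR-grep, and it does not support mismatches; the function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def pattern_to_regex(pattern):
    """Translate an IUPAC nucleotide pattern into a regex body (hypothetical helper)."""
    try:
        return "".join(IUPAC_DNA[base] for base in pattern.upper())
    except KeyError as err:
        raise ValueError(f"unsupported symbol in pattern: {err.args[0]}") from None


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    body = pattern_to_regex(pattern)
    # A lookahead consumes no characters, so overlapping hits are all reported.
    regex = re.compile(f"(?=({body}))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # W stands for A or T, so both TATAAA and TATATA satisfy the pattern.
    print(find_matches("TATAWA", "GGTATAAAGGTATATA"))
```

Exact matching through the `re` module keeps the sketch short; approximate matching with a bounded number of mismatches, as the abstract attributes to NR-grep, would need a different algorithm.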
17.  Fungal BLAST and Model Organism BLASTP Best Hits: new comparison resources at the Saccharomyces Genome Database (SGD) 
Nucleic Acids Research  2004;33(Database Issue):D374-D377.
The Saccharomyces Genome Database (SGD) is a scientific database of gene, protein and genomic information for the yeast Saccharomyces cerevisiae. SGD has recently developed two new resources that facilitate nucleotide and protein sequence comparisons between S. cerevisiae and other organisms. The Fungal BLAST tool provides directed searches against all fungal nucleotide and protein sequences available from GenBank, divided into categories according to organism, status of completeness and annotation, and source. The Model Organism BLASTP Best Hits resource displays, for each S. cerevisiae protein, the single most similar protein from several model organisms and presents links to the database pages of those proteins, facilitating access to curated information about potential orthologs of yeast proteins.
PMCID: PMC539977  PMID: 15608219
18.  Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms 
Nucleic Acids Research  2004;32(Database issue):D311-D314.
The Saccharomyces Genome Database (SGD), a scientific database of the molecular biology and genetics of the yeast Saccharomyces cerevisiae, has recently developed several new resources that allow the comparison and integration of information on a genome-wide scale, enabling the user not only to find detailed information about individual genes, but also to make connections across groups of genes with common features and across different species. The Fungal Alignment Viewer displays alignments of sequences from multiple fungal genomes, while the Sequence Similarity Query tool displays PSI-BLAST alignments of each S. cerevisiae protein with similar proteins from any species whose sequences are contained in the non-redundant (nr) protein data set at NCBI. The Yeast Biochemical Pathways tool integrates groups of genes by their common roles in metabolism and displays the metabolic pathways in a graphical form. Finally, the Find Chromosomal Features search interface provides a versatile tool for querying multiple types of information in SGD.
PMCID: PMC308767  PMID: 14681421
19.  Saccharomyces Genome Database (SGD) provides biochemical and structural information for budding yeast proteins 
Nucleic Acids Research  2003;31(1):216-218.
The Saccharomyces Genome Database (SGD) has recently developed new resources to provide more complete information about proteins from the budding yeast Saccharomyces cerevisiae. The PDB Homologs page provides structural information from the Protein Data Bank (PDB) about yeast proteins and/or their homologs. SGD has also created a resource that utilizes the eMOTIF database for motif information about a given protein. A third new resource is the Protein Information page, which contains protein physical and chemical properties, such as molecular weight and hydropathicity scores, predicted from the translated ORF sequence.
PMCID: PMC165501  PMID: 12519985
20.  Saccharomyces Genome Database (SGD) provides secondary gene annotation using the Gene Ontology (GO) 
Nucleic Acids Research  2002;30(1):69-72.
The Saccharomyces Genome Database (SGD) resources, ranging from genetic and physical maps to genome-wide analysis tools, reflect the scientific progress in identifying genes and their functions over the last decade. As emphasis shifts from identification of the genes to identification of the role of their gene products in the cell, SGD seeks to provide its users with annotations that will allow relationships to be made between gene products, both within Saccharomyces cerevisiae and across species. To this end, SGD is annotating genes to the Gene Ontology (GO), a structured representation of biological knowledge that can be shared across species. The GO consists of three separate ontologies describing molecular function, biological process and cellular component. The goal is to use published information to associate each characterized S. cerevisiae gene product with one or more GO terms from each of the three ontologies. To be useful, this must be done in a manner that allows accurate associations based on experimental evidence, modifications to GO when necessary, and careful documentation of the annotations through evidence codes for given citations. Reaching this goal is an ongoing process at SGD. For information on the current progress of GO annotations at SGD and other participating databases, as well as a description of each of the three ontologies, please visit the GO Consortium page. SGD gene associations to GO can be found by visiting our site.
PMCID: PMC99086  PMID: 11752257
21.  The Stanford Microarray Database 
Nucleic Acids Research  2001;29(1):152-155.
The Stanford Microarray Database (SMD) stores raw and normalized data from microarray experiments, and provides web interfaces for researchers to retrieve, analyze and visualize their data. The two immediate goals for SMD are to serve as a storage site for microarray data from ongoing research at Stanford University, and to facilitate the public dissemination of that data once published, or released by the researcher. Of paramount importance is the connection of microarray data with the biological data that pertains to the DNA deposited on the microarray (genes, clones etc.). SMD makes use of many public resources to connect expression information to the relevant biology, including SGD [Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Issel-Tarver,L., Kasarskis,A., Scafe,C.R., Sherlock,G., Binkley,G., Jin,H. et al. (2000) Nucleic Acids Res., 28, 77–80], YPD and WormPD [Costanzo,M.C., Hogan,J.D., Cusick,M.E., Davis,B.P., Fancher,A.M., Hodges,P.E., Kondu,P., Lengieza,C., Lew-Smith,J.E., Lingner,C. et al. (2000) Nucleic Acids Res., 28, 73–76], Unigene [Wheeler,D.L., Chappey,C., Lash,A.E., Leipe,D.D., Madden,T.L., Schuler,G.D., Tatusova,T.A. and Rapp,B.A. (2000) Nucleic Acids Res., 28, 10–14], dbEST [Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) Nature Genet., 4, 332–333] and SWISS-PROT [Bairoch,A. and Apweiler,R. (2000) Nucleic Acids Res., 28, 45–48] and can be accessed at
PMCID: PMC29818  PMID: 11125075
22.  Saccharomyces Genome Database provides tools to survey gene expression and functional analysis data 
Nucleic Acids Research  2001;29(1):80-81.
Upon the completion of the Saccharomyces cerevisiae genomic sequence in 1996 [Goffeau,A. et al. (1997) Nature, 387, 5], several creative and ambitious projects have been initiated to explore the functions of gene products or gene expression on a genome-wide scale. To help researchers take advantage of these projects, the Saccharomyces Genome Database (SGD) has created two new tools, Function Junction and Expression Connection. Together, the tools form a central resource for querying multiple large-scale analysis projects for data about individual genes. Function Junction provides information from diverse projects that shed light on the role a gene product plays in the cell, while Expression Connection delivers information produced by the ever-increasing number of microarray projects. WWW access to SGD is available at
PMCID: PMC29796  PMID: 11125055
23.  Integrating functional genomic information into the Saccharomyces Genome Database 
Nucleic Acids Research  2000;28(1):77-80.
The Saccharomyces Genome Database (SGD) stores and organizes information about the nearly 6200 genes in the yeast genome. The information is organized around the ‘locus page’ and directs users to the detailed information they seek. SGD is endeavoring to integrate the existing information about yeast genes with the large volume of data generated by functional analyses that are beginning to appear in the literature and on web sites. New features will include searches of systematic analyses and Gene Summary Paragraphs that succinctly review the literature for each gene. In addition to current information, such as gene product and phenotype descriptions, the new locus page will also describe a gene product’s cellular process, function and localization using a controlled vocabulary developed in collaboration with two other model organism databases. We describe these developments in SGD through the newly reorganized locus page. The SGD is accessible via the WWW at http://genome-www.stanford.edu/Saccharomyces/
PMCID: PMC102447  PMID: 10592186
