NCBI site redesign
In late 2009, NCBI launched a long-term project to redesign and standardize the NCBI website. Containing more than 4000 pages, the NCBI website is a complex system of interconnected resources, many of which have unique design aspects that can make navigating the NCBI site challenging. To alleviate this, we have adopted a new set of web design standards and have applied them to several resources so far, including PubMed, Nuccore, EST, GSS, Protein, Gene, dbVar and Epigenomics. The new pages have four standard elements: (i) the page header, which contains links to the NCBI home page and MyNCBI as well as two pull-down menus that provide navigation to NCBI resources and how-to guides; (ii) the search bar, which contains a pull-down menu of all Entrez databases along with links to search tools and help documentation; (iii) the page body, containing the page content such as search results or data records; and (iv) the page footer, containing five lists of links to information about NCBI, lists of categorized resources and several popular or featured resources. In the coming months, more resources will adopt this new design, which we expect will make the NCBI site more consistent and easier to navigate.
Common elements in the new Entrez page designs
In addition to the standard header and footer, resources that have been updated to conform to the new Entrez design share several common elements: a home page, search tools, display controls and download controls. The home page of a data resource (e.g. www.ncbi.nlm.nih.gov/protein/) contains links to documentation and other information for new users, to relevant tools and to related resources at NCBI. On pages containing search results and data records, new ‘Display Settings’ and ‘Send to’ controls appear on the left and right sides of the display, respectively. These new and simplified controls replace sets of pull-down menus and allow users to select multiple settings at once.
The NCBI Guide
In conjunction with the new web standards discussed above, we replaced the old NCBI home page with the NCBI Guide, an application that serves as an interactive directory of the NCBI site. On the main page of the NCBI Guide, the categories in the Resource pull-down menu in the standard header are duplicated in a list on the left of the page. Clicking on any category displays a list of relevant resources sorted into four groups: databases, downloads, submissions and tools. Popular resources are listed on the right under a ‘Quick Links’ heading. A list of how-to guides is also available via the ‘How-To’ tab on these pages. A list of the most heavily used resources is provided on the main Guide page in the ‘Popular Resources’ box and also as a list in the standard footer.
The Epigenomics database (www.ncbi.nlm.nih.gov/epigenomics/) is a new information resource at NCBI specifically aimed at highlighting epigenomics data. Epigenomics is an emerging field of research that studies how, despite sharing a common genomic sequence, different cell types and cell lineages acquire distinct patterns of gene expression. Epigenetic features examined include post-translational modifications of histone proteins, genomic DNA methylation, chromatin organization and the expression of non-coding regulatory RNA. Raw data from these experiments, together with extensive meta-data, are stored in the GEO (Gene Expression Omnibus) and SRA (Sequence Read Archive) databases. The new Epigenomics resource provides a higher-level view, allowing users to search and browse the data based on biological attributes such as cell type, tissue type, differentiation stage and health status, among many others. Data have been pre-mapped to genomic coordinates (to make ‘genome tracks’), so users are not required to be familiar with or manipulate the raw data. Tracks may be visualized in either the NCBI or UCSC genome viewers or may be downloaded to the user’s computer for local analysis. Data from the Roadmap Epigenomics project, which are currently being hosted at GEO (www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/), are being mirrored and are available for viewing and downloading from this new resource.
Database of Genomic Structural Variation
In 2010, NCBI launched the Database of Genomic Structural Variation (dbVar), an archive of large-scale genomic variants such as insertions, deletions, translocations and inversions (www.ncbi.nlm.nih.gov/dbvar/). Currently, dbVar (2) contains over 50 studies from human, rhesus macaque, chimpanzee, mouse, dog, fruit fly and pig, and accepts data derived from several methods including computational sequence analysis and microarray experiments. Each variant is linked to a graphical view showing its genomic context.
Inferred Biomolecular Interactions Server
Recently, NCBI introduced the Inferred Biomolecular Interactions Server (IBIS), a research server that analyzes and predicts interaction partners and binding site locations in proteins (3). IBIS (www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi) integrates the interactions observed in structural complexes from the Molecular Modeling Database (MMDB) for different types of binding partners including proteins, chemical ligands, nucleic acids, peptides and ions. IBIS also infers binding sites and partners from homologous protein complexes. To emphasize biologically relevant binding sites, similar sites are clustered together based on their evolutionary conservation. In the future, NCBI plans to incorporate observed and inferred interactions of this kind throughout the Entrez 3D structure resources.
New outreach resources and services
NCBI recently redesigned its main Education page (www.ncbi.nlm.nih.gov/Education/) and introduced several new outreach initiatives including training webinars and a new series of courses called Discovery Workshops (4). The new page has links to documentation, educational tools, upcoming conference exhibits and news items. Also on the page are links to the new NCBI pages on Facebook and Twitter, plus YouTube pages that contain short video tutorials and videos from special events at NCBI.
BLAST and COBALT updates
The Short Read Archive (SRA) BLAST page, accessible from the ‘Specialized BLAST’ section of the main BLAST page (blast.ncbi.nlm.nih.gov), now has an option for searching WGS sequences from 454 Sequencing systems. The WGS sequences are grouped by genus in a pull-down menu, and if multiple species have data within a genus, a separate menu appears allowing individual species to be selected. These data sets are updated daily, so new WGS data quickly become available for searching. The standard BLAST pages now have additional options for filtering searches. If the ‘Align two or more sequences’ checkbox is not checked, users can either include or exclude data from any number of specified organisms or taxa, greatly increasing the range of customized data sets available. In addition, checkboxes are available that allow users to exclude ‘model’ sequences (RefSeq XM and XP accessions) as well as sequences from uncultured or environmental samples. Finally, COBALT (5) users can download the output multiple alignment to a file in several popular formats including gapped FASTA, ClustalW, Phylip and Nexus.
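As an illustration of working with a downloaded alignment locally, the following minimal Python sketch reads a gapped FASTA file into memory and reports which columns contain gaps. It assumes the conventional layout for this format (a header line starting with ‘>’, sequence text that may wrap across lines, and ‘-’ as the gap character); the function names and the example sequences are hypothetical and are not part of COBALT.

```python
from typing import Dict, Iterable, List


def read_gapped_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse gapped FASTA text into {sequence id: aligned sequence}."""
    alignment: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("header line has no identifier")
            # The identifier is the first whitespace-delimited token.
            current_id = parts[0]
            if current_id in alignment:
                raise ValueError(f"duplicate identifier: {current_id}")
            alignment[current_id] = ""
        elif current_id is None:
            raise ValueError("sequence data found before any '>' header")
        else:
            # Sequence text may wrap, so append to the current row.
            alignment[current_id] += line
    # Every row of a multiple alignment must have the same length.
    if len({len(seq) for seq in alignment.values()}) > 1:
        raise ValueError("rows differ in length, so this is not an alignment")
    return alignment


def gap_columns(alignment: Dict[str, str], gap: str = "-") -> List[int]:
    """Return the 0-based columns in which at least one row has a gap."""
    rows = list(alignment.values())
    if not rows:
        return []
    return [i for i in range(len(rows[0])) if any(row[i] == gap for row in rows)]


if __name__ == "__main__":
    # Hypothetical two-sequence alignment, for illustration only.
    example = [
        ">seq1 example protein A",
        "MKT-LLV",
        ">seq2 example protein B",
        "MK-ALLV",
    ]
    aln = read_gapped_fasta(example)
    print(gap_columns(aln))
```

To read a saved file, pass an open file handle, for example `read_gapped_fasta(open("alignment.fasta"))`, where the file name is a placeholder.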
MyNCBI allows users to store personal configuration options such as search filters, LinkOut preferences and document delivery providers. Several enhancements have been made to MyNCBI in the past year, including an update to allow users to sign in using credentials for an account with a partner organization such as Google, eRA Commons, VeriSign or a local university. My Bibliography was enhanced to allow users to add citations from books, meetings, presentations, patents and articles not found in PubMed, and also to give users the ability to manage their compliance with the NIH Public Access Policy. In addition, the number of PubMed filter selections has been expanded from five to 15, and users may now change their PubMed default settings for display format, items per page, and the method for sorting search results.
Updates to literature resources
In addition to the changes outlined above for PubMed as part of the Entrez redesign, NCBI released several enhancements for both PubMed and PubMed Central (PMC). For the first time, PubMed now includes citations for books and book chapters available on the NCBI Bookshelf. To aid in searching, an autocomplete feature was added to the PubMed search box, and the PubMed Clinical Queries page (www.ncbi.nlm.nih.gov/pubmed/clinical) was redesigned to show immediate results for clinical studies, systematic reviews and medical genetics side by side. To assist users in finding related literature, PMC full-text views now include a list of related PubMed abstracts on the right. In addition, links to PubMed abstracts cited in the text now appear to the right of the paragraph containing the citation.
New discovery components within the Entrez system
NCBI continued to add new discovery components that assist researchers in finding particular Entrez links and using them to discover interesting relationships within the NCBI databases. Two such components were introduced on protein sequence view pages: an ad that alerts users that the protein being viewed is part of a biological pathway or other system within the Biosystems database and provides a link to that pathway; and an ad that describes and links to a cluster of sequences in the Protein Clusters database that includes the protein being viewed. Both of these ads appear in the right column of the sequence view page. For search operations that retrieve 20 or fewer nucleotide or protein sequences, links now appear in the right column that allow users to run BLAST and/or COBALT on all or any checked subset of the sequences.