PMCC PMCC

Search tips
Search criteria

Advanced
Results 1-25 (26)
 

Clipboard (0)
None

Select a Filter Below

Journals
Year of Publication
more »
1.  IgBLAST: an immunoglobulin variable domain sequence analysis tool 
Nucleic Acids Research  2013;41(Web Server issue):W34-W40.
The variable domain of an immunoglobulin (IG) sequence is encoded by multiple genes, including the variable (V) gene, the diversity (D) gene and the joining (J) gene. Analysis of IG sequences typically requires identification of each gene, as well as a comparison of sequence variations in the context of defined regions. General purpose tools, such as the BLAST program, have only limited use for such tasks, as the rearranged nature of an IG sequence and the variable length of each gene requires multiple rounds of BLAST searches for a single IG sequence. Additionally, manual assembly of different genes is difficult and error-prone. To address these issues and to facilitate other common tasks in analysing IG sequences, we have developed the sequence analysis tool IgBLAST (http://www.ncbi.nlm.nih.gov/igblast/). With this tool, users can view the matches to the germline V, D and J genes, details at rearrangement junctions, the delineation of IG V domain framework regions and complementarity determining regions. IgBLAST has the capability to analyse nucleotide and protein sequences and can process sequences in batches. Furthermore, IgBLAST allows searches against the germline gene databases and other sequence databases simultaneously to minimize the chance of missing possibly the best matching germline V gene.
doi:10.1093/nar/gkt382
PMCID: PMC3692102  PMID: 23671333
2.  BLAST: a more efficient report with usability improvements 
Nucleic Acids Research  2013;41(Web Server issue):W29-W33.
The Basic Local Alignment Search Tool (BLAST) website at the National Center for Biotechnology (NCBI) is an important resource for searching and aligning sequences. A new BLAST report allows faster loading of alignments, adds navigation aids, allows easy downloading of subject sequences and reports and has improved usability. Here, we describe these improvements to the BLAST report, discuss design decisions, describe other improvements to the search page and database documentation and outline plans for future development. The NCBI BLAST URL is http://blast.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkt282
PMCID: PMC3692093  PMID: 23609542
3.  Primer-BLAST: A tool to design target-specific primers for polymerase chain reaction 
BMC Bioinformatics  2012;13:134.
Background
Choosing appropriate primers is probably the single most important factor affecting the polymerase chain reaction (PCR). Specific amplification of the intended target requires that primers do not have matches to other targets in certain orientations and within certain distances that allow undesired amplification. The process of designing specific primers typically involves two stages. First, the primers flanking regions of interest are generated either manually or using software tools; then they are searched against an appropriate nucleotide sequence database using tools such as BLAST to examine the potential targets. However, the latter is not an easy process as one needs to examine many details between primers and targets, such as the number and the positions of matched bases, the primer orientations and distance between forward and reverse primers. The complexity of such analysis usually makes this a time-consuming and very difficult task for users, especially when the primers have a large number of hits. Furthermore, although the BLAST program has been widely used for primer target detection, it is in fact not an ideal tool for this purpose as BLAST is a local alignment algorithm and does not necessarily return complete match information over the entire primer range.
Results
We present a new software tool called Primer-BLAST to alleviate the difficulty in designing target-specific primers. This tool combines BLAST with a global alignment algorithm to ensure a full primer-target alignment and is sensitive enough to detect targets that have a significant number of mismatches to primers. Primer-BLAST allows users to design new target-specific primers in one step as well as to check the specificity of pre-existing primers. Primer-BLAST also supports placing primers based on exon/intron locations and excluding single nucleotide polymorphism (SNP) sites in primers.
Conclusions
We describe a robust and fully implemented general purpose primer design tool that designs target-specific PCR primers. Primer-BLAST offers flexible options to adjust the specificity threshold and other primer properties. This tool is publicly available at http://www.ncbi.nlm.nih.gov/tools/primer-blast.
doi:10.1186/1471-2105-13-134
PMCID: PMC3412702  PMID: 22708584
4.  New finite-size correction for local alignment score distributions 
BMC Research Notes  2012;5:286.
Background
Local alignment programs often calculate the probability that a match occurred by chance. The calculation of this probability may require a “finite-size” correction to the lengths of the sequences, as an alignment that starts near the end of either sequence may run out of sequence before achieving a significant score.
Findings
We present an improved finite-size correction that considers the distribution of sequence lengths rather than simply the corresponding means. This approach improves sensitivity and avoids substituting an ad hoc length for short sequences that can underestimate the significance of a match. We use a test set derived from ASTRAL to show improved ROC scores, especially for shorter sequences.
Conclusions
The new finite-size correction improves the calculation of probabilities for a local alignment. It is now used in the BLAST+ package and at the NCBI BLAST web site ( http://blast.ncbi.nlm.nih.gov).
doi:10.1186/1756-0500-5-286
PMCID: PMC3483159  PMID: 22691307
5.  Domain enhanced lookup time accelerated BLAST 
Biology Direct  2012;7:12.
Background
BLAST is a commonly-used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-specific-iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.
Results
We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI’s Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.
Conclusions
DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the “Protein BLAST” link at http://blast.ncbi.nlm.nih.gov.
Reviewers
This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.
doi:10.1186/1745-6150-7-12
PMCID: PMC3438057  PMID: 22510480
6.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2011;40(D1):D13-D25.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkr1184
PMCID: PMC3245031  PMID: 22140104
7.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2010;39(Database issue):D38-D51.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkq1172
PMCID: PMC3013733  PMID: 21097890
9.  BLAST+: architecture and applications 
BMC Bioinformatics  2009;10:421.
Background
Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications.
Results
We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.
Conclusion
The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
doi:10.1186/1471-2105-10-421
PMCID: PMC2803857  PMID: 20003500
10.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2009;38(Database issue):D5-D16.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, Reference Sequence, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Peptidome, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkp967
PMCID: PMC2808881  PMID: 19910364
12.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2008;37(Database issue):D5-D15.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications is custom implementation of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkn741
PMCID: PMC2686545  PMID: 18940862
13.  Database indexing for production MegaBLAST searches 
Bioinformatics  2008;24(16):1757-1764.
Motivation: The BLAST software package for sequence comparison speeds up homology search by preprocessing a query sequence into a lookup table. Numerous research studies have suggested that preprocessing the database instead would give better performance. However, production usage of sequence comparison methods that preprocess the database has been limited to programs such as BLAT and SSAHA that are designed to find matches when query and database subsequences are highly similar.
Results: We developed a new version of the MegaBLAST module of BLAST that does the initial phase of finding short seeds for matches by searching a database index. We also developed a program makembindex that preprocesses the database into a data structure for rapid seed searching. We show that the new ‘indexed MegaBLAST’ is faster than the ‘non-indexed’ version for most practical uses. We show that indexed MegaBLAST is faster than miBLAST, another implementation of BLAST nucleotide searching with a preprocessed database, for most of the 200 queries we tested. To deploy indexed MegaBLAST as part of NCBI'sWeb BLAST service, the storage of databases and the queueing mechanism were modified, so that some machines are now dedicated to serving queries for a specific database. The response time for such Web queries is now faster than it was when each computer handled queries for multiple databases.
Availability: The code for indexed MegaBLAST is part of the blastn program in the NCBI C++ toolkit. The preprocessor program makembindex is also in the toolkit. Indexed MegaBLAST has been used in production on NCBI's Web BLAST service to search one version of the human and mouse genomes since October 2007. The Linux command-line executables for blastn and makembindex, documentation, and some query sets used to carry out the tests described below are available in the directory: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/indexed_megablast
Contact: schaffer@helix.nih.gov
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn322
PMCID: PMC2696921  PMID: 18567917
14.  NCBI BLAST: a better web interface 
Nucleic Acids Research  2008;36(Web Server issue):W5-W9.
Basic Local Alignment Search Tool (BLAST) is a sequence similarity search program. The public interface of BLAST, http://www.ncbi.nlm.nih.gov/blast, at the NCBI website has recently been reengineered to improve usability and performance. Key new features include simplified search forms, improved navigation, a list of recent BLAST results, saved search strategies and a documentation directory. Here, we describe the BLAST web application's new features, explain design decisions and outline plans for future improvement.
doi:10.1093/nar/gkn201
PMCID: PMC2447716  PMID: 18440982
15.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2007;36(Database issue):D13-D21.
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkm1000
PMCID: PMC2238880  PMID: 18045790
16.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2006;35(Database issue):D5-D12.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link(BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at .
doi:10.1093/nar/gkl1031
PMCID: PMC1781113  PMID: 17170002
17.  BLAST: improvements for better sequence analysis 
Nucleic Acids Research  2006;34(Web Server issue):W6-W9.
Basic local alignment search tool (BLAST) is a sequence similarity search program. The National Center for Biotechnology Information (NCBI) maintains a BLAST server with a home page at . We report here on recent enhancements to the results produced by the BLAST server at the NCBI. These include features to highlight mismatches between similar sequences, show where the query was masked for low-complexity sequence, and integrate information about the database sequences from the NCBI Entrez system into the BLAST display. Changes to how the database sequences are fetched have also improved the speed of the report generator.
doi:10.1093/nar/gkl164
PMCID: PMC1538791  PMID: 16845079
18.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2005;34(Database issue):D173-D180.
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: .
doi:10.1093/nar/gkj158
PMCID: PMC1347520  PMID: 16381840
19.  BLAST: at the core of a powerful and diverse set of sequence analysis tools 
Nucleic Acids Research  2004;32(Web Server issue):W20-W25.
Basic Local Alignment Search Tool (BLAST) is one of the most heavily used sequence analysis tools available in the public domain. There is now a wide choice of BLAST algorithms that can be used to search many different sequence databases via the BLAST web pages (http://www.ncbi.nlm.nih.gov/BLAST/). All the algorithm–database combinations can be executed with default parameters or with customized settings, and the results can be viewed in a variety of ways. A new online resource, the BLAST Program Selection Guide, has been created to assist in the definition of search strategies. This article discusses optimal search strategies and highlights some BLAST features that can make your searches more powerful.
doi:10.1093/nar/gkh435
PMCID: PMC441573  PMID: 15215342
20.  Database resources of the National Center for Biotechnology Information: update 
Nucleic Acids Research  2004;32(Database issue):D35-D40.
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkh073
PMCID: PMC308807  PMID: 14681353
21.  Database resources of the National Center for Biotechnology 
Nucleic Acids Research  2003;31(1):28-33.
In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBITaxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, References Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
PMCID: PMC165480  PMID: 12519941
22.  Database resources of the National Center for Biotechnology Information: 2002 update 
Nucleic Acids Research  2002;30(1):13-16.
In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human¡VMouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov.
PMCID: PMC99094  PMID: 11752242
23.  Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements 
Nucleic Acids Research  2001;29(14):2994-3005.
PSI-BLAST is an iterative program to search a database for proteins with distant similarity to a query sequence. We investigated over a dozen modifications to the methods used in PSI-BLAST, with the goal of improving accuracy in finding true positive matches. To evaluate performance we used a set of 103 queries for which the true positives in yeast had been annotated by human experts, and a popular measure of retrieval accuracy (ROC) that can be normalized to take on values between 0 (worst) and 1 (best). The modifications we consider novel improve the ROC score from 0.758 ± 0.005 to 0.895 ± 0.003. This does not include the benefits from four modifications we included in the ‘baseline’ version, even though they were not implemented in PSI-BLAST version 2.0. The improvement in accuracy was confirmed on a small second test set. This test involved analyzing three protein families with curated lists of true positives from the non-redundant protein database. The modification that accounts for the majority of the improvement is the use, for each database sequence, of a position-specific scoring system tuned to that sequence’s amino acid composition. The use of composition-based statistics is particularly beneficial for large-scale automated applications of PSI-BLAST.
PMCID: PMC55814  PMID: 11452024
24.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2001;29(1):11-16.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, GeneMap’99, Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP), SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheri­tance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.
PMCID: PMC29800  PMID: 11125038
25.  Database resources of the National Center for Biotechnology Information 
Nucleic Acids Research  2000;28(1):10-14.
In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval and resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s Web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing pages, GeneMap’99, Davis Human–Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP) pages, Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, Cancer Genome Anatomy Project (CGAP) pages, SAGEmap, Online Mendelian Inheritance in Man (OMIM) and the Molecular Modeling Database (MMDB). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov
PMCID: PMC102437  PMID: 10592169

Results 1-25 (26)