The BLAST family of search programs (8) is provided for the most frequent type
of analysis performed on GenBank, the sequence-similarity search.
NCBI’s Web interface to the standard BLAST 2.1 program
accepts either a sequence or accession number and performs the search
using either an identity matrix for blastn (nucleotide) searches
or a PAM or BLOSUM scoring matrix for protein searches. BLAST produces
a set of gapped alignments, with links to the full document records,
accompanied by an alignment score and a measure of statistical significance,
called the Expectation Value, for judging the quality of the alignment.
Web BLAST provides a graphical overview of the alignments, color-coded by
alignment score, which clearly shows the extent and quality of sequence
similarities, as well as the disposition of gaps in the alignments.
Web BLAST can also generate a taxonomically organized output that
emphasizes taxonomic patterns of sequence similarity.
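To make the relationship between an alignment score and its Expectation Value concrete, the following minimal Python sketch scores an ungapped nucleotide alignment with an identity matrix and converts the raw score S to an E-value using the Karlin-Altschul relation E = K·m·n·exp(-λS). The function names, the +1/-3 match/mismatch scores, and the values chosen for λ and K are assumptions made for illustration; they are not taken from the text. BLAST itself derives λ and K from the scoring system in use and applies length corrections that this sketch omits.

```python
import math

# Illustrative placeholders only: real lambda and K depend on the scoring system.
ASSUMED_LAMBDA = 1.374
ASSUMED_K = 0.711


def identity_score(query, subject, match=1, mismatch=-3):
    """Raw score of an ungapped, equal-length alignment under an identity matrix."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment requires sequences of equal length")
    return sum(match if a == b else mismatch for a, b in zip(query, subject))


def bit_score(raw_score, lam=ASSUMED_LAMBDA, k=ASSUMED_K):
    """Normalize a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, query_len, db_len, lam=ASSUMED_LAMBDA, k=ASSUMED_K):
    """Expected number of chance alignments scoring at least raw_score.

    Uses E = K * m * n * exp(-lambda * S) with raw (uncorrected) lengths.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


if __name__ == "__main__":
    s = identity_score("ACGTACGTACGTACGTACGT", "ACGTACGTACGAACGTACGT")
    print("raw score:", s)
    print("bit score:", bit_score(s))
    print("E-value  :", expect_value(s, query_len=20, db_len=1_000_000))
```

A lower E-value indicates an alignment less likely to have arisen by chance in a database of the given size; for a fixed score, the E-value grows in proportion to both query and database length.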
The default databases searched by BLAST are the non-redundant
(nr) nucleotide and protein databases constructed from the Entrez
databases. Several specialized databases may also be searched, and
searches may be restricted to sequences from a particular organism.
Query sequences may be filtered for low complexity or human repeats.
Customized BLAST pages allow queries against finished human genomic
data, microbial genomes or the genomes of malaria-associated pathogens.
Specialized versions of BLAST are offered for the needs of protein
similarity searching. Position Specific Iterated BLAST (PSI-BLAST)
initially performs a conventional
BLAST search to produce alignments from which it constructs a position
specific score matrix (PSSM). Subsequent BLAST iterations use this
PSSM to find similarities in the database. Pattern Hit Initiated
BLAST (PHI-BLAST) (10) takes as input
both a query sequence and a pattern present within the query sequence. The
pattern specifies an obligatory match between query and database
sequences, about which optimal local alignments are constructed.
Another variant, ‘BLAST2Sequences’ (11), compares
two DNA or protein sequences and produces a dot-plot representation
of the alignments it reports.
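The idea of a position specific score matrix, as used by PSI-BLAST above, can be illustrated with a short Python sketch that derives per-column log-odds scores from a set of aligned protein segments. This is a simplified illustration rather than the PSI-BLAST procedure itself: the function names are hypothetical, and the uniform background frequencies and fixed pseudocount are assumptions, whereas PSI-BLAST uses sequence weighting and data-dependent pseudocounts.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, background=None, pseudocount=1.0):
    """Build a log-odds PSSM (one dict per column) from equal-length aligned rows.

    Gap characters and unknown residues are ignored when counting.
    """
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("all aligned rows must have the same length")
    if background is None:
        # Assumption: uniform background frequencies, for simplicity.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    pssm = []
    for col in range(length):
        counts = Counter(row[col] for row in aligned if row[col] in background)
        total = sum(counts.values()) + pseudocount * len(background)
        column = {}
        for aa, bg in background.items():
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / bg)  # log-odds score in bits
        pssm.append(column)
    return pssm


def score_segment(pssm, segment):
    """Sum the position specific scores for a segment aligned to the matrix."""
    return sum(column.get(aa, 0.0) for column, aa in zip(pssm, segment))


if __name__ == "__main__":
    matrix = build_pssm(["ACD", "ACE", "ACD"])
    print(score_segment(matrix, "ACD"), score_segment(matrix, "WWW"))
```

Residues that recur in a column receive positive scores at that position and unobserved residues receive negative ones, so a database segment that matches the conserved positions scores higher than it would under a single position-independent matrix.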
Basic BLAST 2.0 searches can also be performed by email through
the address: blast/at/ncbi.nlm.nih.gov. Documentation can
be obtained by sending the word ‘help’ to the