
Results 1-4 (4)

1.  Incorporation of biological knowledge into distance for clustering genes 
Bioinformation  2007;1(10):396-405.
In this paper we propose a data-based algorithm to marry existing biological knowledge (e.g., functional annotations of genes) with experimental data (gene expression profiles) in creating an overall dissimilarity that can be used with any clustering algorithm that accepts a general dissimilarity matrix. We explore this idea with two publicly available gene expression data sets and functional annotations, and compare the results with clustering results that use only the experimental data. Although more elaborate evaluations might be called for, the present paper makes a strong case for utilizing existing biological information in the clustering process.
Supplement is available at
PMCID: PMC1896054  PMID: 17597929
knowledge; distance; clustering; genes; expression
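As a rough illustration of the idea in this abstract, the sketch below blends an expression-based distance with an annotation-based distance into one dissimilarity table. It is not the paper's algorithm: the fixed `weight`, the correlation distance, the Jaccard distance on annotation sets, and all gene and term names are assumptions chosen for the example.

```python
from math import sqrt


def correlation_distance(x, y):
    """Return 1 - Pearson correlation of two expression profiles (range 0..2).

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def annotation_distance(a, b):
    """Return the Jaccard distance between two sets of annotation terms."""
    union = a | b
    if not union:
        return 1.0  # assumption: unannotated genes are treated as maximally distant
    return 1.0 - len(a & b) / len(union)


def combined_dissimilarity(profiles, annotations, weight=0.5):
    """Blend both distances; `weight` is the share given to prior knowledge."""
    genes = sorted(profiles)
    d = {}
    for g in genes:
        for h in genes:
            if g == h:
                d[(g, h)] = 0.0
                continue
            # Halve the correlation distance so both terms lie in [0, 1].
            expr = correlation_distance(profiles[g], profiles[h]) / 2.0
            know = annotation_distance(annotations[g], annotations[h])
            d[(g, h)] = (1.0 - weight) * expr + weight * know
    return genes, d


# Hypothetical toy data: three genes, three time points, made-up term labels.
profiles = {"g1": [1, 2, 3], "g2": [2, 4, 6], "g3": [3, 2, 1]}
annotations = {"g1": {"termA"}, "g2": {"termA", "termB"}, "g3": {"termC"}}
genes, dissim = combined_dissimilarity(profiles, annotations)
```

The resulting table can be handed to any clustering routine that accepts a precomputed dissimilarity matrix, which is the property the abstract emphasizes.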
2.  BLAST: a more efficient report with usability improvements 
Nucleic Acids Research  2013;41(Web Server issue):W29-W33.
The Basic Local Alignment Search Tool (BLAST) website at the National Center for Biotechnology Information (NCBI) is an important resource for searching and aligning sequences. A new BLAST report loads alignments faster, adds navigation aids, allows easy downloading of subject sequences and reports, and has improved usability. Here, we describe these improvements to the BLAST report, discuss design decisions, describe other improvements to the search page and database documentation, and outline plans for future development. The NCBI BLAST URL is
PMCID: PMC3692093  PMID: 23609542
3.  Domain enhanced lookup time accelerated BLAST 
Biology Direct  2012;7:12.
BLAST is a commonly used software package for comparing a query sequence to a database of known sequences; in this study, we focus on protein sequences. Position-Specific Iterated BLAST (PSI-BLAST) iteratively searches a protein sequence database, using the matches in round i to construct a position-specific score matrix (PSSM) for searching the database in round i + 1. Biegert and Söding developed Context-sensitive BLAST (CS-BLAST), which combines information from searching the sequence database with information derived from a library of short protein profiles to achieve better homology detection than PSI-BLAST, which builds its PSSMs from scratch.
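To make the notion of a position-specific score matrix concrete, here is a deliberately simplified sketch that turns a set of aligned matches into per-column log-odds scores. It is not PSI-BLAST's actual procedure, which also weights sequences and uses data-dependent pseudocounts; the uniform background frequency, the flat `pseudocount`, and the function names are assumptions for illustration.

```python
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build one {residue: log-odds score} dict per alignment column.

    `aligned` is a list of equal-length strings; characters outside the
    20-letter alphabet (such as '-') are ignored in the column counts.
    """
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    pssm = []
    for pos in range(len(aligned[0])):
        column = [seq[pos] for seq in aligned if seq[pos] in AMINO_ACIDS]
        denom = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            freq = (column.count(aa) + pseudocount) / denom
            scores[aa] = log2(freq / background)
        pssm.append(scores)
    return pssm


def score_sequence(pssm, seq):
    """Sum the column scores of an ungapped sequence of the same length."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


# Hypothetical matches from a previous search round.
matches = ["ACD", "ACE", "ACD"]
pssm = build_pssm(matches)
```

In an iterated search, a matrix of this kind built from round i would replace the single query sequence when scoring database sequences in round i + 1.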
We describe a new method, called domain enhanced lookup time accelerated BLAST (DELTA-BLAST), which searches a database of pre-constructed PSSMs before searching a protein-sequence database, to yield better homology detection. For its PSSMs, DELTA-BLAST employs a subset of NCBI’s Conserved Domain Database (CDD). On a test set derived from ASTRAL, with one round of searching, DELTA-BLAST achieves a ROC5000 of 0.270 vs. 0.116 for CS-BLAST. The performance advantage diminishes in iterated searches, but DELTA-BLAST continues to achieve better ROC scores than CS-BLAST.
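The ROC5000 figures quoted above are ROC_n scores with n = 5000. As a sketch of how such a score is commonly computed, the function below takes a ranked hit list and averages the number of true positives found before each of the first n false positives, normalized by n times the number of true positives. The input format, the padding rule for lists with fewer than n false positives, and the assumption that every true positive appears in the list are choices made for this example, not details taken from the paper.

```python
def roc_n(ranked_labels, n):
    """Compute the ROC_n score of a ranked hit list.

    `ranked_labels` holds booleans, best-scoring hit first; True marks a
    true positive and False a false positive.
    """
    total_true = sum(ranked_labels)
    if total_true == 0 or n <= 0:
        return 0.0
    true_seen = 0
    false_seen = 0
    area = 0
    for is_true in ranked_labels:
        if is_true:
            true_seen += 1
        else:
            false_seen += 1
            area += true_seen  # true positives ranked above this false positive
            if false_seen == n:
                break
    # Assumption: if the list ran out before n false positives, the missing
    # ones are treated as ranking below every true positive in the list.
    area += (n - false_seen) * true_seen
    return area / (n * total_true)


# Hypothetical ranked result: hits 1, 2 and 4 are true positives.
example = [True, True, False, True, False, False]
score_at_2 = roc_n(example, 2)
```

A score of 1.0 means every true positive outranks the first n false positives, and 0.0 means none does.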
DELTA-BLAST is a useful program for the detection of remote protein homologs. It is available under the “Protein BLAST” link at
This article was reviewed by Arcady Mushegian, Nick V. Grishin, and Frank Eisenhaber.
PMCID: PMC3438057  PMID: 22510480
4.  Independent Component Analysis-motivated Approach to Classificatory Decomposition of Cortical Evoked Potentials 
BMC Bioinformatics  2006;7(Suppl 2):S8.
Independent Component Analysis (ICA) proves to be useful in the analysis of neural activity, as it allows for identification of distinct sources of activity. Applied to measurements registered in a controlled setting and under exposure to an external stimulus, it can facilitate analysis of the impact of the stimulus on those sources. The link between the stimulus and a given source can be verified by a classifier that is able to "predict" the condition a given signal was registered under, solely based on the components. However, the ICA's assumption about statistical independence of sources is often unrealistic and turns out to be insufficient to build an accurate classifier. Therefore, we propose to utilize a novel method, based on hybridization of ICA, multi-objective evolutionary algorithms (MOEA), and rough sets (RS), that attempts to improve the effectiveness of signal decomposition techniques by providing them with "classification-awareness."
The preliminary results described here are very promising and further investigation of other MOEAs and/or RS-based classification accuracy measures should be pursued. Even a quick visual analysis of those results can provide an interesting insight into the problem of neural activity analysis.
We present a methodology for classificatory decomposition of signals. One of the main advantages of our approach is that, rather than relying solely on often unrealistic assumptions about the statistical independence of sources, components are generated in light of the underlying classification problem itself.
PMCID: PMC1683557  PMID: 17118151
