We have devised a gene-clustering algorithm that is completely unsupervised: the user sets no parameters, and the clustering is self-optimizing, yielding the set of clusters that minimizes within-cluster distance and maximizes between-cluster distance. The algorithm was implemented in Java and tested on a randomly selected 200-gene subset of 3000 genes from cell-cycle data in S. cerevisiae. AlignACE was used to evaluate the resulting optimized cluster set for upstream cis-regulons. The optimized cluster set was of comparable quality to cluster sets obtained by two established methods (complete linkage and k-means), even though the algorithm was given only a small, randomly selected subset of the data (200 vs. 3000 genes) and no supervision. The MAP and specificity scores of the highest-ranking motifs identified in the largest clusters were comparable.
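The optimization criterion stated above (small within-cluster distance, large between-cluster distance) can be illustrated with a minimal Java sketch. The abstract does not say how distances are measured or how the two quantities are combined, so this sketch assumes Euclidean distance between expression profiles and scores a candidate clustering by the ratio of mean within-cluster distance to mean between-cluster distance. The class name `ClusterQuality`, its methods, and the toy data are hypothetical and are not taken from the authors' implementation.

```java
// Hypothetical illustration only: NOT the authors' algorithm or objective.
// Assumes Euclidean distance and a within/between ratio as the quality score.
class ClusterQuality {

    // Euclidean distance between two expression profiles of equal length.
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double diff = a[k] - b[k];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Ratio of mean within-cluster distance to mean between-cluster distance.
    // Lower values mean tighter, better-separated clusters.
    // labels[i] is the cluster index assigned to profiles[i].
    static double score(double[][] profiles, int[] labels) {
        double withinSum = 0.0, betweenSum = 0.0;
        int withinPairs = 0, betweenPairs = 0;
        for (int i = 0; i < profiles.length; i++) {
            for (int j = i + 1; j < profiles.length; j++) {
                double d = euclidean(profiles[i], profiles[j]);
                if (labels[i] == labels[j]) {
                    withinSum += d;
                    withinPairs++;
                } else {
                    betweenSum += d;
                    betweenPairs++;
                }
            }
        }
        if (withinPairs == 0 || betweenPairs == 0) {
            throw new IllegalArgumentException(
                "need at least one within-cluster and one between-cluster pair");
        }
        return (withinSum / withinPairs) / (betweenSum / betweenPairs);
    }

    public static void main(String[] args) {
        // Toy data: two well-separated groups of two "genes" each.
        double[][] profiles = {{0, 0}, {0, 1}, {10, 0}, {10, 1}};
        int[] natural = {0, 0, 1, 1};   // groups nearby profiles together
        int[] shuffled = {0, 1, 0, 1};  // splits each natural group
        System.out.println("natural grouping:  " + score(profiles, natural));
        System.out.println("shuffled grouping: " + score(profiles, shuffled));
    }
}
```

Under these assumptions, a self-optimizing procedure would compare candidate clusterings by such a score and keep the one with the lowest value; on the toy data, the natural grouping scores lower than the shuffled one.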