Bioinformatics and genomics play fundamental roles in our understanding and design of biological systems and therapeutic medicine at all levels of organization, spanning molecular biology, the life sciences, engineering, and computer science. They are burgeoning fields that study biological sequences through the development of algorithms, computational and statistical techniques, and theories to solve formal and practical biomedical problems.
Biocomp 2007 would not have achieved such success without the hard work of its contributors and organizers. Organizing a major academic event in these fields is not possible without contributions from the members of the program, scientific review, advisory, and steering committees, and we thank them for their professionalism. We express our sincere gratitude to the program and scientific review committee members for their high-quality, timely evaluation of more than 400 full-length regular research papers. We also extend our sincere thanks to all the conference co-chairs, vice-chairs, session chairs, organizers, and committee members for their dedication and professional services. In particular, Michelle M. Zhu, Youping Deng, Hamid R. Arabnia, Jack Y. Yang, Mary Qu Yang, Rattikorn Hewett, Yunlong Liu, Jianlin Cheng, Vladimir N Uversky, My Tra Thai, Yufang Jin, and the scientific review committee members dedicated themselves to the scientific reviews. Hamid R. Arabnia managed the paper submission system and handled various important organizing and academic affairs. Jack Y. Yang and Mary Qu Yang initiated the special genomics sessions with new components that are defined dynamically in response to the specific needs of inter- and multidisciplinary cutting-edge research and education. Accordingly, Mary Qu Yang, Jack Y. Yang, and Hamid R. Arabnia initiated and arranged the organization of cutting-edge research workshops, keynote lectures, special sessions, and poster presentations in addition to the traditional tutorial lectures.
The Biocomp committee would like to acknowledge the International Society of Intelligent Biological Medicine (ISIBM) for its academic support and co-sponsorship. We also express our sincere appreciation for the excellent professional services provided to Biocomp by Isobel Peters at BioMed Central Ltd for the BMC Genomics supplementary issue.
Biocomp received submissions from both presenters at the conference and non-presenters. Each submitted manuscript was reviewed by at least three referees, and the quality of each paper was evaluated based on its contribution to genomics and bioinformatics. The accepted papers in this special issue cover a broad range of subject areas and can be divided mainly into the following categories:
Microarray data analysis
Eight papers discuss novel mathematical or statistical approaches to analyzing microarray datasets. Gu and Liu [1] proposed a Bayesian biclustering model, implemented a Gibbs sampling procedure, and illustrated that this Bayesian biclustering approach can effectively identify multiple clusters from gene expression data. Zhu and Wu [2] proposed a parallel computation-based random matrix theory approach to analyze the cross-correlations of gene expression data in an entirely automatic and objective manner, eliminating the ambiguities and subjectivity inherent in human decisions. Yang et al. conducted extensive physiological and transcriptomic studies to characterize Fur in Shewanella oneidensis with regard to iron and acid tolerance response, using their own microarray expression datasets. Xu et al. developed novel graph-based methods to combine multiple microarray datasets to discover co-expression network modules related to cancer. Pirooznia et al. compared various microarray classification methods, including SVM, RBF neural nets, MLP neural nets, Bayesian, decision tree, and random forest methods. Mao et al. investigated the transcriptomic profiling of three yeast mutants lacking C2H2 zinc finger proteins and found that the gene expression patterns were dramatically different between the wild type and the mutants. Gong et al. studied the effect of explosive compounds such as TNT and RDX on the transcriptomic pattern of earthworms. Deng et al. proposed an algorithm based on integer linear programming to select a minimum number of non-unique probes for microarray experiments using d-disjunct matrices.
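For readers unfamiliar with the d-disjunct matrices mentioned in connection with non-unique probe selection, the following minimal sketch checks the standard group-testing definition: a binary matrix is d-disjunct if no column is covered by the union of any d other columns. Treating rows as probes and columns as targets is an assumption made here for illustration, and the function name `is_d_disjunct` is hypothetical. This is not the integer linear programming algorithm from the paper.

```python
from itertools import combinations


def is_d_disjunct(matrix, d):
    """Return True if no column is covered by the union of any d other columns.

    `matrix` is a list of rows of 0/1 values. Here rows are assumed to be
    probes and columns targets (an illustrative convention only).
    """
    n_cols = len(matrix[0]) if matrix else 0
    # Represent each column as the set of row indices that hold a 1.
    cols = [{r for r, row in enumerate(matrix) if row[c]} for c in range(n_cols)]
    for j in range(n_cols):
        others = [c for c in range(n_cols) if c != j]
        for group in combinations(others, d):
            union = set().union(*(cols[c] for c in group))
            if cols[j] <= union:  # column j is hidden by the group
                return False
    return True


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
nested = [[1, 1], [0, 1]]
print(is_d_disjunct(identity, 2))
print(is_d_disjunct(nested, 1))
```

The brute-force check enumerates every group of d columns, so it is only practical for small matrices; it is meant to make the definition concrete rather than to scale.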
Genome and sequence analysis
Li et al. proposed an effective algorithm to enable rapid mapping of millions of oligonucleotide fragments to a genome of any length. By using bit-shifting operations, they achieved a speed increase of at least one order of magnitude over existing tools.
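To illustrate why bit shifting helps in oligonucleotide mapping, the sketch below packs each base into two bits and updates the integer code of a sliding k-mer with one shift and one mask per position. It is a generic illustration under assumed names (`kmer_codes`, `build_index`, `map_oligo`) and supports exact matches only; it is not the algorithm of Li et al.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # 2 bits per base


def kmer_codes(seq, k):
    """Yield (start position, integer code) for every k-mer in seq."""
    mask = (1 << (2 * k)) - 1  # keeps only the last k bases
    code = 0
    valid = 0  # consecutive unambiguous bases seen so far
    for i, base in enumerate(seq):
        bits = CODE.get(base)
        if bits is None:  # e.g. 'N': restart the window
            code, valid = 0, 0
            continue
        code = ((code << 2) | bits) & mask  # shift in the new base
        valid += 1
        if valid >= k:
            yield i - k + 1, code


def build_index(genome, k):
    """Map each k-mer code to the list of positions where it occurs."""
    index = {}
    for pos, code in kmer_codes(genome, k):
        index.setdefault(code, []).append(pos)
    return index


def map_oligo(index, oligo):
    """Return exact-match positions of an oligo whose length equals the index k."""
    for _, code in kmer_codes(oligo, len(oligo)):
        return index.get(code, [])
    return []  # oligo contained an ambiguous base


index = build_index("ACGTACGTTA", 4)
print(map_oligo(index, "ACGT"))
print(map_oligo(index, "GTTA"))
```

Because each new k-mer code is derived from the previous one in constant time, indexing a genome takes a single pass, and a lookup is one dictionary access.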
Liu et al. quantified the effects of recombination on populations by estimating the minimum number of recombination events in the history of a DNA sample. Two new algorithms were proposed for estimating the lower bound under the infinite-sites model, and the new lower bounds can also be extended to allow for recurrent mutations. Yue et al. extended the existing method GRAPPA for reconstructing phylogeny from genome rearrangements and developed a new method, GRAPPA-IR, to analyze gene rearrangements in chloroplast genomes with inverted repeats.
Protein structure prediction and classification
Yang et al. exploited machine learning techniques, including variants of Self-Organizing Global Ranking, a decision tree, and a support vector machine algorithm, to predict the tertiary structure of transmembrane proteins. Hecker et al. developed a state-of-the-art protein disorder predictor and tested it on a large protein disorder dataset created from the Protein Data Bank; the relationship between sensitivity and specificity was also evaluated. Habib et al. presented a new SVM-based approach to predict subcellular locations based on amino acid and amino acid pair composition. The approach allows more protein features to be taken into consideration, which consequently improves the accuracy significantly. Wang et al. discussed an empirical approach to specify the localization of protein binding regions utilizing information including the distribution pattern of the detected RNA fragments and the sequence specificity of RNase digestion.
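As a concrete picture of the amino acid and amino acid pair composition features mentioned above for subcellular localization, the following sketch turns a protein sequence into a 420-dimensional vector (20 single-residue frequencies plus 400 adjacent-pair frequencies). The function name `composition_features` and the normalization are assumptions made for illustration; the exact feature definition used by Habib et al. may differ, and the classifier itself is not shown.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition_features(sequence):
    """Return 20 residue frequencies followed by 400 adjacent-pair frequencies."""
    # Non-standard residue symbols are dropped before counting (an assumption).
    seq = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    singles = Counter(seq)
    pairs = Counter(a + b for a, b in zip(seq, seq[1:]))
    n_single = max(len(seq), 1)      # avoid division by zero
    n_pair = max(len(seq) - 1, 1)
    features = [singles[aa] / n_single for aa in AMINO_ACIDS]
    features += [pairs[a + b] / n_pair for a, b in product(AMINO_ACIDS, repeat=2)]
    return features


vector = composition_features("MKTAYIAKQR")
print(len(vector))
```

A fixed-length vector of this kind can be passed to any standard classifier, which is what makes composition-based features convenient for sequences of varying length.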
Gene regulation elements analysis
Yang et al. extended their previous work, which identified candidate bidirectional promoters in the human genome, to map the orthologous promoter regions in the mouse genome. They showed that bidirectional promoters can be distinguished from other genomic features, including non-bidirectional promoters. Chen et al. developed an analytical method to identify the thermodynamic model that best describes the mode of transcription factor (TF)-TF interaction among a set of TFs for target genes. Wang et al. conducted research to simultaneously identify transcription factor and microRNA (miRNA) binding sites from gene expression microarray databases. Two models are proposed for predicting the most influential cis-acting elements under a given biological condition and for estimating the effects of those elements on gene expression levels.
Disease classification using machine learning techniques
Yang et al. developed a multi-task learning technique based on a genetic algorithm to improve the prediction accuracy of tumor classification by using information contained in redundant features that would otherwise be discarded. Experimental results demonstrated that this approach is effective and performs better than other heuristic methods. Liu et al. developed a feature selection method that combines supervised learning and statistical measures for the chosen candidate features/SNPs to reconcile redundant information and, in doing so, improve classification performance in association studies. A Support Vector based Recursive Feature Addition (SVRFA) scheme is also proposed to aid SNP-disease association analysis. Yang et al. developed an intelligent decision system using machine learning techniques and markers to characterize tissue as cancerous, non-cancerous, or borderline. These algorithms can detect microscopic pathological changes based on features derived from gene expression levels and metabolic profiles.
Biological network construction
Hub proteins in a protein network can bind to many different protein partners to regulate and control a wide variety of physiological processes. Oldfield et al. studied protein intrinsic disorder arising from structural plasticity or flexibility and illustrated how such intrinsic disorder can provide a means for hubs to associate with many partners. Jin et al. presented their work on nonlinear control and stability analysis of genetic regulatory networks. Such a control scheme can drive a genetic regulatory network to desired levels by adjusting transcriptional rates. This research can also be used to design model-based experiments for regulating gene expression profiles.
Genome and database search tools
Dai et al. developed a visual editor for profile Hidden Markov Models (HMMEditor), which can visualize the profile HMM architecture, transition probabilities, and emission probabilities. As open-source software, it serves as a useful tool for biological sequence analysis and modeling. Vanteru et al. introduced a semantics-enabled technique to link PubMed to the Gene Ontology for ontology-based browsing. A Latent Semantic Analysis (LSA) framework is used to semantically interface PubMed abstracts with the Gene Ontology, with the aim of improving search performance.