Results 1-2 (2)
1.  Semiparametric prognosis models in genomic studies 
Briefings in Bioinformatics  2010;11(4):385-393.
Development of high-throughput technologies makes it possible to survey the whole genome. Genomic studies have been extensively conducted, searching for markers with predictive power for prognosis of complex diseases such as cancer, diabetes and obesity. Most existing statistical analyses focus on developing marker selection techniques, while little attention is paid to the underlying prognosis models. In this article, we review three commonly used prognosis models, namely the Cox, additive risk and accelerated failure time models. We conduct simulations and show that gene identification can be unsatisfactory under model misspecification. We analyze three cancer prognosis studies under the three models, and show that the gene identification results, prediction performance of all identified genes combined, and reproducibility of each identified gene are model-dependent. We suggest that in practical data analysis, more attention should be paid to the model assumptions, and multiple models may need to be considered.
doi:10.1093/bib/bbp070
PMCID: PMC2905523  PMID: 20123942
genomic studies; semiparametric prognosis models; model comparison
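
For orientation, the three prognosis models named in the abstract above can be written in their standard textbook forms. The notation is an assumption made here and may differ from the article's: T is the survival time, Z the vector of gene expressions, \beta the regression coefficients, \lambda_0(t) an unspecified baseline hazard, and \varepsilon an error term with unspecified distribution.

```latex
\begin{align*}
\text{Cox:} \quad & \lambda(t \mid Z) = \lambda_0(t)\,\exp(\beta^\top Z) \\
\text{Additive risk:} \quad & \lambda(t \mid Z) = \lambda_0(t) + \beta^\top Z \\
\text{Accelerated failure time:} \quad & \log T = \beta^\top Z + \varepsilon
\end{align*}
```

The covariates act multiplicatively on the hazard in the Cox model, additively on the hazard in the additive risk model, and directly on the log survival time in the accelerated failure time model, so the same coefficient has a different interpretation under each model.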
2.  Penalized feature selection and classification in bioinformatics 
Briefings in Bioinformatics  2008;9(5):392-403.
In bioinformatics studies, supervised classification with high-dimensional input variables is frequently encountered. Examples routinely arise in genomic, epigenetic and proteomic studies. Feature selection can be employed along with classifier construction to avoid over-fitting, to generate more reliable classifiers and to provide more insight into the underlying causal relationships. In this article, we provide a review of several recently developed penalized feature selection and classification techniques (which belong to the family of embedded feature selection methods) for bioinformatics studies with high-dimensional input. Classification objective functions, penalty functions and computational algorithms are discussed. Our goal is to make interested researchers aware of these feature selection and classification methods that are applicable to high-dimensional bioinformatics data.
doi:10.1093/bib/bbn027
PMCID: PMC2733190  PMID: 18562478
bioinformatics application; feature selection; penalization
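
As a rough illustration of the kind of embedded, penalized method the abstract above describes, the following is a minimal sketch of L1-penalized (lasso-type) logistic regression fitted by proximal gradient descent. It is not taken from the article: the function names, the toy data, the step size and the choice of the L1 penalty are all assumptions made for this example.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=200):
    """Hypothetical helper: minimize mean logistic loss + lam * ||w||_1.

    The intercept b is left unpenalized; each iteration takes a gradient
    step on the loss and then soft-thresholds the weights.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed 0/1 label.
        resid = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data (made up): feature 0 determines the class, feature 1 is uninformative.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [1, 1, 0, 0]
weights, intercept = fit_l1_logistic(X, y, lam=0.01)
```

On this toy data the weight of the uninformative feature stays exactly zero while the informative feature receives a positive weight, which is the sense in which the penalty performs feature selection and classifier construction in a single fit. Larger values of `lam` shrink more weights to zero.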