Results 1-2 (2)
Nucleic Acids Research (1)
PLoS ONE (1)
Xu, Ying (2)
Yin, Yanbin (2)
Chen, Xin (1)
Gogarten, Johann Peter (1)
Huang, Jinling (1)
Mao, Xizeng (1)
Yang, Jincai (1)
Zhou, Chan (1)
AST: An Automated Sequence-Sampling Method for Improving the Taxonomic Diversity of Gene Phylogenetic Trees
Gogarten, Johann Peter
A challenge in phylogenetic inference of gene trees is how to sample a large pool of homologous sequences to derive a good representative subset. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods cannot handle a large pool of sequences because of their high demand for computing resources; (2) applications analyzing a collection of gene trees prefer trees with fewer operational taxonomic units (OTUs), for instance when detecting horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we tested it on four problems: inferring the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs, and glycosyl-transferase gene family 8, and studying ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, making the tool generally useful for phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
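The idea of sampling for taxonomic diversity can be illustrated with a minimal sketch. This is not the AST algorithm itself, which the abstract does not specify; it is a simple round-robin selection across taxonomic groups, assuming each sequence already carries a taxon label. The function name `sample_by_taxon`, the `(seq_id, taxon)` input format, and the example pool are all hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def sample_by_taxon(sequences, k):
    """Pick up to k sequence ids, spreading picks across taxonomic groups.

    sequences: list of (seq_id, taxon) pairs; the taxon label (e.g. a phylum
    name) is an assumed input, not something this sketch infers.
    """
    # Group sequence ids by taxon, keeping first-seen order of taxa.
    groups = defaultdict(list)
    for seq_id, taxon in sequences:
        groups[taxon].append(seq_id)

    # Round-robin: take the first sequence of every taxon, then the second,
    # and so on, so over-represented taxa cannot crowd out rare ones.
    picked = []
    for layer in zip_longest(*groups.values()):
        for seq_id in layer:
            if seq_id is not None and len(picked) < k:
                picked.append(seq_id)
    return picked


# Hypothetical pool biased towards one extensively studied group.
pool = [
    ("s1", "Proteobacteria"),
    ("s2", "Proteobacteria"),
    ("s3", "Proteobacteria"),
    ("s4", "Firmicutes"),
    ("s5", "Cyanobacteria"),
]
sample = sample_by_taxon(pool, 3)
print(sample)
```

With a budget of three sequences, the sketch returns one sequence from each of the three groups rather than three Proteobacteria sequences, which is the kind of bias correction the abstract describes in case (3).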
dbCAN: a web resource for automated carbohydrate-active enzyme annotation
Nucleic Acids Research
2012;40(Web Server issue):W445-W451.
Carbohydrate-active enzymes (CAZymes) are very important to the biotech industry, particularly the emerging biofuel industry, because CAZymes are responsible for the synthesis, degradation and modification of all the carbohydrates on Earth. We have developed a web resource, dbCAN (http://csbl.bmb.uga.edu/dbCAN/annotate.php), that provides automated CAZyme signature domain-based annotation for any protein data set (e.g. proteins from a newly sequenced genome) submitted to our server. To accomplish this, we have explicitly defined a signature domain for every CAZyme family, derived from CDD (Conserved Domain Database) searches and literature curation. We have also constructed a hidden Markov model (HMM) to represent the signature domain of each CAZyme family. These CAZyme family-specific HMMs are our key contribution and the foundation for the automated CAZyme annotation.
PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine (NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers.