Results 1-11 (11)
 

1.  Pfam: the protein families database 
Nucleic Acids Research  2013;42(D1):D222-D230.
Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.
doi:10.1093/nar/gkt1223
PMCID: PMC3965110  PMID: 24288371
2.  Rfam 11.0: 10 years of RNA families 
Nucleic Acids Research  2012;41(D1):D226-D232.
The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.
doi:10.1093/nar/gks1005
PMCID: PMC3531072  PMID: 23125362
4.  The Pfam protein families database 
Nucleic Acids Research  2011;40(D1):D290-D301.
Pfam is a widely used database of protein families, currently containing more than 13 000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the ‘sunburst’ representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.
doi:10.1093/nar/gkr1065
PMCID: PMC3245129  PMID: 22127870
5.  InterPro in 2011: new developments in the family and domain prediction database 
Nucleic Acids Research  2011;40(D1):D306-D312.
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.
doi:10.1093/nar/gkr948
PMCID: PMC3245097  PMID: 22096229
6.  A phase II trial of Triapine® (NSC# 663249) and gemcitabine as second line treatment of advanced non-small cell lung cancer: Eastern Cooperative Oncology Group Study 1503 
Investigational new drugs  2009;28(1):91-97.
Summary
Background
The objective of ECOG 1503 was to determine the response rate of this combination in the second-line treatment of advanced NSCLC.
Methods
Triapine 105 mg/m² IV on days 1, 8, and 15, and gemcitabine 1,000 mg/m² on days 1, 8, and 15, of a 28 day cycle.
Results
Eighteen patients enrolled. Three patients were not eligible due to protocol violations. No objective antitumor responses were seen. Three patients (20%) experienced stable disease (90% CI 5.7–44%). Median overall survival: 5.4 months (95% CI 4.2–11.6 months); median time to progression: 1.8 months (95% CI 1.7–3.5 months). Five patients developed acute infusion reactions to Triapine® related to elevated methemoglobinemia. Patients with MDR1 variant genotypes of C3435T experienced superior overall survival compared to non-variants (13.3 vs. 4.3 months, respectively, p=0.023).
Conclusion
This regimen did not demonstrate activity in relapsed NSCLC. Prolonged survival seen with MDR1 variant genotypes is hypothesis-generating.
doi:10.1007/s10637-009-9230-z
PMCID: PMC3045859  PMID: 19238328
Non-small cell lung cancer; Combination chemotherapy; Ribonucleotide reductase; Single nucleotide polymorphism; ATP binding cassette transporter
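The stable-disease figures in the Results of entry 6 can be checked with a short calculation. The sketch below assumes 15 eligible patients (18 enrolled minus 3 ineligible) and assumes the interval is an exact (Clopper-Pearson) binomial interval; the abstract does not state which method was used. The function names are illustrative only.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo: float, hi: float, iters: int = 100) -> float:
    """Locate a root of a monotone function f on [lo, hi] by bisection."""
    f_lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k: int, n: int, level: float = 0.90) -> tuple[float, float]:
    """Exact two-sided binomial interval for k successes out of n trials."""
    alpha = 1 - level
    # Lower bound solves P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: 1 - binom_cdf(k - 1, n, p) - alpha / 2, 0.0, 1.0
    )
    # Upper bound solves P(X <= k | p) = alpha / 2.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2, 0.0, 1.0
    )
    return lower, upper


if __name__ == "__main__":
    # Assumption: 3 patients with stable disease among 15 eligible patients.
    low, high = clopper_pearson(3, 15, level=0.90)
    print(f"rate = {3 / 15:.0%}, 90% CI = {low:.1%} to {high:.1%}")
```

Under these assumptions the interval comes out close to the reported 5.7–44%, which suggests, but does not confirm, that an exact binomial method was used.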
7.  Rfam: Wikipedia, clans and the “decimal” release 
Nucleic Acids Research  2010;39(Database issue):D141-D145.
The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.
doi:10.1093/nar/gkq1129
PMCID: PMC3013711  PMID: 21062808
8.  The Pfam protein families database 
Nucleic Acids Research  2009;38(Database issue):D211-D222.
Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is ∼100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11 912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).
doi:10.1093/nar/gkp985
PMCID: PMC2808889  PMID: 19920124
9.  Rfam: updates to the RNA families database 
Nucleic Acids Research  2008;37(Database issue):D136-D140.
Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.
doi:10.1093/nar/gkn766
PMCID: PMC2686503  PMID: 18953034
10.  The Pfam protein families database 
Nucleic Acids Research  2007;36(Database issue):D281-D288.
Pfam is a comprehensive collection of protein domains and families, represented as multiple sequence alignments and as profile hidden Markov models. The current release of Pfam (22.0) contains 9318 protein families. Pfam is now based not only on the UniProtKB sequence database, but also on NCBI GenPept and on sequences from selected metagenomics projects. Pfam is available on the web from the consortium members using a new, consistent and improved website design in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/), as well as from mirror sites in France (http://pfam.jouy.inra.fr/) and South Korea (http://pfam.ccbb.re.kr/).
doi:10.1093/nar/gkm960
PMCID: PMC2238907  PMID: 18039703
11.  CKAAPs DB: a Conserved Key Amino Acid Positions DataBase 
Nucleic Acids Research  2002;30(1):409-411.
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. CKAAPs may be important in protein folding and structural stability and function, and hence useful for protein engineering studies. This paper provides an update to the initial report of CKAAPs DB [Li et al. (2001) Nucleic Acids Res., 29, 329–331]. CKAAPs DB contains CKAAPs for the representative set of polypeptide chains derived from the CE and FSSP databases, as well as subdomains (conserved regions of the order of 100 residues within a domain) identified by CE. The new version now offers different perspectives on the CKAAPs. First, CKAAPs are mapped onto their respective Protein Data Bank (PDB) structures rendered by Molscript, providing a spatial context for the CKAAPs. Secondly, CKAAPs may be highlighted within a structure-based sequence alignment, as well as secondary structure alignment. Thirdly, the resulting sequence homologs from the structure alignment may be viewed in alignments colorized based on identities and property groups using Mview. New search capabilities have also been provided for searching by keyword combinations, PDB IDs, EC numbers, GI numbers, LocusLink ID, taxonomy, gene ontology and pathways. A new custom CKAAPs analysis interface has been implemented where a user may change the criteria for inclusion of chains, initiate CKAAPs analysis and retrieve results. CKAAPs DB is accessible through the web at http://ckaaps.sdsc.edu/. Plain text analysis results are available by FTP at ftp://ftp.sdsc.edu/pub/sdsc/biology/ckaap.
PMCID: PMC99066  PMID: 11752351
