Results 1-25 (48)
 

1.  A role for cytochrome b5 in the in vivo disposition of anti-cancer and cytochrome P450 probe drugs in mice 
The role of microsomal cytochrome b5 (Cyb5) in defining the rate of drug metabolism and disposition has been intensely debated for several decades. Recently we described mouse models involving the hepatic or global deletion of Cyb5, demonstrating its central role in in vivo drug disposition. We have now used the cytochrome b5 complete null (BCN) model to determine the role of Cyb5 in the metabolism of ten pharmaceuticals metabolised by a range of cytochrome P450s, including five anti-cancer drugs, in vivo and in vitro. The extent to which metabolism was significantly affected by the absence of Cyb5 was substrate-dependent, with AUC increased (75-245%), and clearance decreased (35-72%), for phenacetin, metoprolol and chlorzoxazone. Tolbutamide disposition was not significantly altered by Cyb5 deletion, while midazolam clearance was decreased by 66%. The absence of Cyb5 had no effect on gefitinib and paclitaxel disposition, while significant changes were measured in the in vivo pharmacokinetics of cyclophosphamide (Cmax and terminal half-life increased 55% and 40%, respectively), tamoxifen (AUClast and Cmax increased 370% and 233%, respectively) and anastrozole (AUC and terminal half-life increased 125% and 62%, respectively; clearance down 80%). These data provide strong evidence that both hepatic and extra-hepatic Cyb5 levels are important determinants of in vivo drug disposition catalysed by a range of cytochrome P450s, including currently-prescribed anti-cancer agents, and that individuality in Cyb5 expression could be a significant determinant of rates of drug disposition in man.
doi:10.1124/dmd.113.055277
PMCID: PMC3935460  PMID: 24115751
2.  Structure and computational analysis of a novel protein with metallopeptidase-like and circularly permuted winged-helix-turn-helix domains reveals a possible role in modified polysaccharide biosynthesis 
BMC Bioinformatics  2014;15:75.
Background
CA_C2195 from Clostridium acetobutylicum is a protein of unknown function. Sequence analysis predicted that part of the protein contained a metallopeptidase-related domain. There are over 200 homologs of similar size in large sequence databases such as UniProt, with pairwise sequence identities in the range of ~40-60%. CA_C2195 was chosen for crystal structure determination for structure-based function annotation of novel protein sequence space.
Results
The structure confirmed that CA_C2195 contained an N-terminal metallopeptidase-like domain. The structure revealed two extra domains: an α+β domain inserted in the metallopeptidase-like domain and a C-terminal circularly permuted winged-helix-turn-helix domain.
Conclusions
Based on our sequence and structural analyses using the crystal structure of CA_C2195 we provide a view into the possible functions of the protein. From contextual information from gene-neighborhood analysis, we propose that rather than being a peptidase, CA_C2195 and its homologs might play a role in biosynthesis of a modified cell-surface carbohydrate in conjunction with several sugar-modification enzymes. These results provide the groundwork for the experimental verification of the function.
doi:10.1186/1471-2105-15-75
PMCID: PMC4000134  PMID: 24646163
CA_C2195; Peptidase; DUF4910; DUF2172; HTH_47; Structural genomics
3.  Present and future potential of plant-derived products to control arthropods of veterinary and medical significance 
Parasites & Vectors  2014;7:28.
The use of synthetic pesticides and repellents to target pests of veterinary and medical significance is becoming increasingly problematic. One alternative approach employs the bioactive attributes of plant-derived products (PDPs). These are particularly attractive on the grounds of low mammalian toxicity, short environmental persistence and complex chemistries that should limit development of pest resistance against them.
Several pesticides and repellents based on PDPs are already available, and in some cases widely utilised, in modern pest management. Many more have a long history of traditional use in poorer areas of the globe where access to synthetic pesticides is often limited. Preliminary studies suggest that PDPs could be more widely used to target numerous medical and veterinary pests, with modes of action often specific to invertebrates.
Though their current and future potential appears significant, development and deployment of PDPs to target veterinary and medical pests is not without issue. Variable efficacy is widely recognised as a constraint on the use of PDPs for pest control. Identifying and developing natural bioactive PDP components in place of chemically less-stable raw or 'whole' products seems to be the most popular solution to this problem. A limited residual activity, often due to photosensitivity or high volatility, is a further drawback in some cases (though potentially advantageous in others). Nevertheless, encapsulation technologies and other slow-release mechanisms offer strong potential to improve residual activity where needed.
The current review provides a summary of existing use and future potential of PDPs against ectoparasites of veterinary and medical significance. Four main types of PDP are considered (pyrethrum, neem, essential oils and plant extracts) for their pesticidal, growth regulating and repellent or deterrent properties. An overview of existing use and research for each is provided, with direction to more extensive reviews given in many sections. Sections to highlight potential issues, modes of action and emerging and future potential are also included.
doi:10.1186/1756-3305-7-28
PMCID: PMC3905284  PMID: 24428899
Botanical; Phytochemical; Plant-derived product; Insecticide; Acaricide; Pesticide; Repellent
4.  Skylign: a tool for creating informative, interactive logos representing sequence alignments and profile hidden Markov models 
BMC Bioinformatics  2014;15:7.
Background
Logos are commonly used in molecular biology to provide a compact graphical representation of the conservation pattern of a set of sequences. They render the information contained in sequence alignments or profile hidden Markov models by drawing a stack of letters for each position, where the height of the stack corresponds to the conservation at that position, and the height of each letter within a stack depends on the frequency of that letter at that position.
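As a rough illustration of the stack and letter heights described above, the following Python sketch computes the classical information-content heights for a single alignment column. It is not Skylign's implementation: the function name `logo_heights`, the DNA alphabet, and the omission of sequence weighting, priors and small-sample correction are assumptions made here for brevity.

```python
import math
from collections import Counter


def logo_heights(column, alphabet="ACGT"):
    """Return (stack_height, {letter: height}) in bits for one alignment column.

    Hypothetical helper: stack height is the classical information content
    log2(|alphabet|) - H(column); each letter's height is its frequency
    times the stack height. Gaps and unknown symbols are ignored.
    """
    counts = Counter(c for c in column if c in alphabet)
    total = sum(counts.values())
    if total == 0:
        return 0.0, {}
    # Observed letter frequencies at this position.
    freqs = {a: counts[a] / total for a in alphabet if counts[a] > 0}
    # Shannon entropy of the column, in bits.
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    stack = math.log2(len(alphabet)) - entropy
    return stack, {a: p * stack for a, p in freqs.items()}


if __name__ == "__main__":
    for col in ("AAAA", "AACC", "ACGT"):
        print(col, logo_heights(col))
```

A fully conserved column yields the tallest stack, while a column with uniform letter frequencies yields a stack of height zero, which is the behaviour the paragraph above describes.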
Results
We present a new tool and web server, called Skylign, which provides a unified framework for creating logos for both sequence alignments and profile hidden Markov models. In addition to static image files, Skylign creates a novel interactive logo plot for inclusion in web pages. These interactive logos enable scrolling, zooming, and inspection of underlying values. Skylign can avoid sampling bias in sequence alignments by down-weighting redundant sequences and by combining observed counts with informed priors. It also simplifies the representation of gap parameters, and can optionally scale letter heights based on alternate calculations of the conservation of a position.
Conclusion
Skylign is available as a website, a scriptable web service with a RESTful interface, and as a software package for download. Skylign’s interactive logos are easily incorporated into a web page with just a few lines of HTML markup. Skylign may be found at http://skylign.org.
doi:10.1186/1471-2105-15-7
PMCID: PMC3893531  PMID: 24410852
Alignment logo; Sequence logo; Profile logo; Hmm logo; Logo server; Web logo
5.  iPfam: a database of protein family and domain interactions found in the Protein Data Bank 
Nucleic Acids Research  2013;42(Database issue):D364-D373.
The database iPfam, available at http://ipfam.org, catalogues Pfam domain interactions based on known 3D structures that are found in the Protein Data Bank, providing interaction data at the molecular level. Previously, the iPfam domain–domain interaction data was integrated within the Pfam database and website, but it has now been migrated to a separate database. This allows for independent development, improving data access and giving clearer separation between the protein family and interactions datasets. In addition to domain–domain interactions, iPfam has been expanded to include interaction data for domain bound small molecule ligands. Functional annotations are provided from source databases, supplemented by the incorporation of Wikipedia articles where available. iPfam (version 1.0) contains >9500 domain–domain and 15 500 domain–ligand interactions. The new website provides access to this data in a variety of ways, including interactive visualizations of the interaction data.
doi:10.1093/nar/gkt1210
PMCID: PMC3965099  PMID: 24297255
6.  Pfam: the protein families database 
Nucleic Acids Research  2013;42(Database issue):D222-D230.
Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures.
doi:10.1093/nar/gkt1223
PMCID: PMC3965110  PMID: 24288371
7.  Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila 
BMC Bioinformatics  2013;14:265.
Background
Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology.
Results
We analyze a previously uncharacterized Pfam protein family called DUF4424 [Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210 from Legionella pneumophila provides the first structural information pertaining to this family. This protein additionally includes the first representative structure of another Pfam family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a 19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure analysis allows us to recognize distant similarities between the DUF4424 domain and individual domains of M1 aminopeptidases and tricorn proteases, which form massive proteasome-like capsids in both archaea and bacteria.
Conclusions
Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming large, multi-component enzyme complexes. We suggest that the YARHG domain may play a role in binding a moiety in proximity with peptidoglycan, such as a hydrophobic outer membrane lipid or lipopolysaccharide.
doi:10.1186/1471-2105-14-265
PMCID: PMC3848476  PMID: 24004689
Domain of unknown function; Protein family; Protein structure; DUF4424; YARHG domain; Sequence analysis
8.  The first structure in a family of peptidase inhibitors reveals an unusual Ig-like fold 
F1000Research  2013;2:154.
We report the crystal structure solution of the Intracellular Protease Inhibitor (IPI) protein from Bacillus subtilis, which has been reported to be an inhibitor of the intracellular subtilisin Isp1 from the same organism. The structure of IPI is a variant of the all-beta, immunoglobulin (Ig) fold. It is possible that IPI is important for protein-protein interactions, of which inhibition of Isp1 is one. The intracellular nature of IPI is questioned, because an alternative ATG codon in the ipi gene would produce a protein with an N-terminal extension containing a signal peptide. It is possible that alternative initiation exists, producing either an intracellular inhibitor or a secreted form that may be associated with the cell surface. Homologues of the IPI protein from other species are multi-domain proteins, containing signal peptides and domains also associated with the bacterial cell-surface. The cysteine peptidase inhibitors chagasin and amoebiasin also have Ig-like folds, but their topology differs significantly from that of IPI, and they share no recent common ancestor. A model of IPI docked to Isp1 shows similarities to other subtilisin:inhibitor complexes, particularly where the inhibitor interacts with the peptidase active site.
doi:10.12688/f1000research.2-154.v2
PMCID: PMC3901451  PMID: 24555072
11.  The challenge of increasing Pfam coverage of the human proteome 
It is a worthy goal to completely characterize all human proteins in terms of their domains. Here, using the Pfam database, we asked how far we have progressed in this endeavour. Ninety per cent of proteins in the human proteome matched at least one of 5494 manually curated Pfam-A families. In contrast, human residue coverage by Pfam-A families was <45%, with 9418 automatically generated Pfam-B families adding a further 10%. Even after excluding predicted signal peptide regions and short regions (<50 consecutive residues) unlikely to harbour new families, for ∼38% of the human protein residues, there was no information in Pfam about conservation and evolutionary relationship with other protein regions. This uncovered portion of the human proteome was found to be distributed over almost 25 000 distinct protein regions. Comparison with proteins in the UniProtKB database suggested that the human regions that exhibited similarity to thousands of other sequences were often either divergent elements or N- or C-terminal extensions of existing families. Thirty-four per cent of regions, on the other hand, matched fewer than 100 sequences in UniProtKB. Most of these did not appear to share any relationship with existing Pfam-A families, suggesting that thousands of new families would need to be generated to cover them. Also, these latter regions were particularly rich in amino acid compositional bias such as the one associated with intrinsic disorder. This could represent a significant obstacle toward their inclusion into new Pfam families. Based on these observations, a major focus for increasing Pfam coverage of the human proteome will be to improve the definition of existing families. New families will also be built, prioritizing those that have been experimentally functionally characterized.
Database URL: http://pfam.sanger.ac.uk/
doi:10.1093/database/bat023
PMCID: PMC3630804
12.  Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions 
Nucleic Acids Research  2013;41(12):e121.
Detection of protein homology via sequence similarity has important applications in biology, from protein structure and function prediction to reconstruction of phylogenies. Although current methods for aligning protein sequences are powerful, challenges remain, including problems with homologous overextension of alignments and with regions under convergent evolution. Here, we test the ability of the profile hidden Markov model method HMMER3 to correctly assign homologous sequences to >13 000 manually curated families from the Pfam database. We identify problem families using protein regions that match two or more Pfam families not currently annotated as related in Pfam. We find that HMMER3 E-value estimates seem to be less accurate for families that feature periodic patterns of compositional bias, such as the ones typically observed in coiled-coils. These results support the continued use of manually curated inclusion thresholds in the Pfam database, especially on the subset of families that have been identified as problematic in experiments such as these. They also highlight the need for developing new methods that can correct for this particular type of compositional bias.
doi:10.1093/nar/gkt263
PMCID: PMC3695513  PMID: 23598997
13.  Dfam: a database of repetitive DNA based on profile hidden Markov models 
Nucleic Acids Research  2012;41(Database issue):D70-D82.
We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
doi:10.1093/nar/gks1265
PMCID: PMC3531169  PMID: 23203985
14.  Recent advances in biocuration: Meeting Report from the fifth International Biocuration Conference 
The 5th International Biocuration Conference brought together over 300 scientists to exchange ideas about their work, as well as to discuss issues relevant to the International Society for Biocuration's (ISB) mission. Recurring themes this year included the creation and promotion of gold standards, the need for more ontologies, and more formal interactions with journals. The conference is an essential part of the ISB's goal to support exchanges among members of the biocuration community. Next year's conference will be held in Cambridge, UK, from 7 to 10 April 2013. In the meantime, the ISB website provides information about the society's activities (http://biocurator.org), as well as related events of interest.
doi:10.1093/database/bas036
PMCID: PMC3483532  PMID: 23110974
16.  Making your database available through Wikipedia: the pros and cons 
Nucleic Acids Research  2011;40(Database issue):D9-D12.
Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content, with many pages on scientific subjects that include peer-reviewed citations yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 articles that describe the use of a wiki in relation to a biological database. In this commentary, we discuss how biological databases can be integrated with Wikipedia, thereby utilising the pre-existing infrastructure, tools and above all, large community of authors (or Wikipedians). The limitations to the content that can be included in Wikipedia are highlighted, with examples drawn from articles found in this issue and other wiki-based resources, indicating why other wiki solutions are necessary. We discuss the merits of using open wikis, like Wikipedia, versus other models, with particular reference to potential vandalism. Finally, we raise the question of the future role of dedicated database biocurators in the context of the thousands of crowdsourced, community annotations that are now being stored in wikis.
doi:10.1093/nar/gkr1195
PMCID: PMC3245093  PMID: 22144683
17.  The Pfam protein families database 
Nucleic Acids Research  2011;40(Database issue):D290-D301.
Pfam is a widely used database of protein families, currently containing more than 13 000 manually curated protein families as of release 26.0. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/). Here, we report on changes that have occurred since our 2010 NAR paper (release 24.0). Over the last 2 years, we have generated 1840 new families and increased coverage of the UniProt Knowledgebase (UniProtKB) to nearly 80%. Notably, we have taken the step of opening up the annotation of our families to the Wikipedia community, by linking Pfam families to relevant Wikipedia pages and encouraging the Pfam and Wikipedia communities to improve and expand those pages. We continue to improve the Pfam website and add new visualizations, such as the ‘sunburst’ representation of taxonomic distribution of families. In this work we additionally address two topics that will be of particular interest to the Pfam community. First, we explain the definition and use of family-specific, manually curated gathering thresholds. Second, we discuss some of the features of domains of unknown function (also known as DUFs), which constitute a rapidly growing class of families within Pfam.
doi:10.1093/nar/gkr1065
PMCID: PMC3245129  PMID: 22127870
18.  InterPro in 2011: new developments in the family and domain prediction database 
Nucleic Acids Research  2011;40(Database issue):D306-D312.
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new developments in the database and its associated software since 2009, including updates to database content, curation processes and Web and programmatic interfaces.
doi:10.1093/nar/gkr948
PMCID: PMC3245097  PMID: 22096229
19.  The crystal structure of a bacterial Sufu-like protein defines a novel group of bacterial proteins that are similar to the N-terminal domain of human Sufu 
Sufu (Suppressor of Fused), a two-domain protein, plays a critical role in regulating Hedgehog signaling and is conserved from flies to humans. A few bacterial Sufu-like proteins have previously been identified based on sequence similarity to the N-terminal domain of eukaryotic Sufu proteins, but none have been structurally or biochemically characterized and their function in bacteria is unknown. We have determined the crystal structure of a more distantly related Sufu-like homolog, NGO1391 from Neisseria gonorrhoeae, at 1.4 Å resolution, which provides the first biophysical characterization of a bacterial Sufu-like protein. The structure revealed a striking similarity to the N-terminal domain of human Sufu (r.m.s.d. of 2.6 Å over 93% of the NGO1391 protein), despite an extremely low sequence identity of ∼15%. Subsequent sequence analysis revealed that NGO1391 defines a new subset of smaller, Sufu-like proteins that are present in ∼200 bacterial species and has resulted in expansion of the SUFU (PF05076) family in Pfam.
doi:10.1002/pro.497
PMCID: PMC3005784  PMID: 20836087
Neisseria gonorrhoeae; NGO1391; UniProt Q5F6Z8; Pfam PF05076; suppressor of fused; sufu-like; structural genomics
20.  HMMER web server: interactive sequence similarity searching 
Nucleic Acids Research  2011;39(Web Server issue):W29-W37.
HMMER is a software suite for protein sequence similarity searches using probabilistic methods. Previously, HMMER has mainly been available only as a computationally intensive UNIX command-line tool, restricting its use. Recent advances in the software, HMMER3, have resulted in a 100-fold speed gain relative to previous versions. It is now feasible to make efficient profile hidden Markov model (profile HMM) searches via the web. A HMMER web server (http://hmmer.janelia.org) has been designed and implemented such that most protein database searches return within a few seconds. Methods are available for searching either a single protein sequence, multiple protein sequence alignment or profile HMM against a target sequence database, and for searching a protein sequence against Pfam. The web server is designed to cater to a range of different user expertise and accepts batch uploading of multiple queries at once. All search methods are also available as RESTful web services, thereby allowing them to be readily integrated as remotely executed tasks in locally scripted workflows. We have focused on minimizing search times and the ability to rapidly display tabular results, regardless of the number of matches found, developing graphical summaries of the search results to provide quick, intuitive appraisement of them.
doi:10.1093/nar/gkr367
PMCID: PMC3125773  PMID: 21593126
21.  Clustered Coding Variants in the Glutamate Receptor Complexes of Individuals with Schizophrenia and Bipolar Disorder 
PLoS ONE  2011;6(4):e19011.
Current models of schizophrenia and bipolar disorder implicate multiple genes; however, their biological relationships remain elusive. To test the genetic role of glutamate receptors and their interacting scaffold proteins, the exons of ten glutamatergic ‘hub’ genes in 1304 individuals were re-sequenced in case and control samples. No significant difference in the overall number of non-synonymous single nucleotide polymorphisms (nsSNPs) was observed between cases and controls. However, cluster analysis of nsSNPs identified two exons encoding the cysteine-rich domain and first transmembrane helix of GRM1 as a risk locus with five mutations highly enriched within these domains. A new splice variant lacking the transmembrane GPCR domain of GRM1 was discovered in the human brain and the GRM1 mutation cluster could perturb the regulation of this variant. The predicted effect on individuals harbouring multiple mutations distributed in their ten hub genes was also examined. Diseased individuals possessed an increased load of deleteriousness from multiple concurrent rare and common coding variants. Together, these data suggest a disease model in which the interplay of compound genetic coding variants, distributed among glutamate receptors and their interacting proteins, contributes to the pathogenesis of schizophrenia and bipolar disorders.
doi:10.1371/journal.pone.0019011
PMCID: PMC3084736  PMID: 21559497
22.  Representative Proteomes: A Stable, Scalable and Unbiased Proteome Set for Sequence Analysis and Functional Annotation 
PLoS ONE  2011;6(4):e18910.
The accelerating growth in the number of protein sequences taxes both the computational and manual resources needed to analyze them. One approach to dealing with this problem is to minimize the number of proteins subjected to such analysis in a way that minimizes loss of information. To this end we have developed a set of Representative Proteomes (RPs), each selected from a Representative Proteome Group (RPG) containing similar proteomes calculated based on co-membership in UniRef50 clusters. A Representative Proteome is the proteome that can best represent all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership threshold (CMT) are provided to allow users to decrease or increase the granularity of the sequence space based on their requirements. We find that a CMT of 55% (RP55) most closely follows standard taxonomic classifications. Further analysis of this set reveals that sequence space is reduced by more than 80% relative to UniProtKB, while retaining both sequence diversity (over 95% of InterPro domains) and annotation information (93% of experimentally characterized proteins). All sets can be browsed and are available for sequence similarity searches and download at http://www.proteininformationresource.org/rps, while the set of 637 RPs determined using a 55% CMT are also available for text searches. Potential applications include sequence similarity searches, protein classification and targeted protein annotation and characterization.
doi:10.1371/journal.pone.0018910
PMCID: PMC3083393  PMID: 21556138
23.  Rfam: Wikipedia, clans and the “decimal” release 
Nucleic Acids Research  2010;39(Database issue):D141-D145.
The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.
doi:10.1093/nar/gkq1129
PMCID: PMC3013711  PMID: 21062808
25.  Cytochrome b5 null mouse: a new model for studying inherited skin disorders and the role of unsaturated fatty acids in normal homeostasis 
Transgenic Research  2010;20(3):491-502.
Microsomal cytochrome b5 is a ubiquitous, 15.2 kDa haemoprotein implicated in a number of cellular processes such as fatty acid desaturation, drug metabolism, steroid hormone biosynthesis and methaemoglobin reduction. As a consequence of these functions this protein has been considered essential for life. Most of the ascribed functions of cytochrome b5, however, stem from in vitro studies and for this reason we have carried out a germline deletion of this enzyme. We have unexpectedly found that cytochrome b5 null mice were viable and fertile, with pups being born at expected Mendelian ratios. However, a number of intriguing phenotypes were identified, including altered drug metabolism, methaemoglobinemia and disrupted steroid hormone homeostasis. In addition to these previously identified roles for this protein, cytochrome b5 null mice displayed skin defects closely resembling those observed in autosomal recessive congenital ichthyosis and retardation of neonatal development, indicating that this protein, possibly as a consequence of its role in the de novo biosynthesis of unsaturated fatty acids, plays a central role in skin development and neonatal nutrition. Results from fatty acid profile analysis of several tissues suggest that cytochrome b5 plays a role controlling saturated/unsaturated homeostasis. These data demonstrate that regional concentrations of unsaturated fatty acids are controlled by endogenous metabolic pathways and not by diet alone.
doi:10.1007/s11248-010-9426-1
PMCID: PMC3090575  PMID: 20676935
Cytochrome b5; Ichthyosis; Methaemoglobinemia; Nutrition; Skin; Unsaturated fatty acids
