Palmoplantar keratodermas (PPKs) are a group of disorders that are diagnostically and therapeutically problematic in dermatogenetics [1-3]. Punctate PPKs are characterized by circumscribed hyperkeratotic lesions on palms and soles with considerable heterogeneity. In 18 families with autosomal dominant punctate PPK (OMIM #148600), we report heterozygous loss-of-function mutations in AAGAB, encoding alpha- and gamma-adaptin binding protein p34, at a previously linked locus on 15q22. p34, a cytosolic protein with a Rab-like GTPase domain, was shown to bind both clathrin adaptor protein complexes, indicative of a role in membrane traffic. Ultrastructurally, lesional epidermis showed abnormalities in intracellular vesicle biology. Immunohistochemistry showed hyperproliferation within the punctate lesions. Knockdown of p34 in keratinocytes led to increased cell division, which was linked to greatly increased epidermal growth factor receptor (EGFR) protein expression and tyrosine phosphorylation. We hypothesize that p34 deficiency may impair endocytic recycling of growth factor receptors such as EGFR, leading to increased signaling and proliferation.
Atopic dermatitis (AD) is a major inflammatory condition of the skin caused by inherited skin barrier deficiency, with mutations in the filaggrin gene predisposing to development of AD. Support for barrier deficiency initiating AD came from flaky tail mice, which have a frameshift mutation in Flg and also carry an unknown gene, matted, causing a matted hair phenotype.
We sought to identify the matted mutant gene in mice and further define whether mutations in the human gene were associated with AD.
A mouse genetics approach was used to separate the matted and Flg mutations to produce congenic single-mutant strains for genetic and immunologic analysis. Next-generation sequencing was used to identify the matted gene. Five independently recruited AD case collections were analyzed to define associations between single nucleotide polymorphisms (SNPs) in the human gene and AD.
The matted phenotype in flaky tail mice is due to a mutation in the Tmem79/Matt gene, with no expression of the encoded protein mattrin in the skin of mutant mice. Mattft mice spontaneously have dermatitis and atopy caused by a defective skin barrier, with mutant mice having systemic sensitization after cutaneous challenge with house dust mite allergens. Meta-analysis of 4,245 AD cases and 10,558 population-matched control subjects showed that a missense SNP, rs6694514, in the human MATT gene has a small but significant association with AD.
In mice, mutations in Matt cause a defective skin barrier and spontaneous dermatitis and atopy. A common SNP in MATT has an association with AD in human subjects.
Allergy; association; atopic dermatitis; atopy; eczema; filaggrin; flaky tail; Matt; mattrin; mouse; mutation; Tmem79; AD, Atopic dermatitis; DM, Double mutant; FLG, Filaggrin; HDM, House dust mite; hpf, High-power field; MAPEG, Membrane-associated proteins in eicosanoid and glutathione metabolism; OR, Odds ratio; SNP, Single nucleotide polymorphism; TEWL, Transepidermal water loss; WT, Wild-type
Alternative cleavage and polyadenylation influence the coding and regulatory potential of mRNAs and where transcription termination occurs. Although widespread, few regulators of this process are known. The Arabidopsis thaliana protein FPA is a rare example of a trans-acting regulator of poly(A) site choice. Analysing fpa mutants therefore provides an opportunity to reveal generic consequences of disrupting this process. We used direct RNA sequencing to quantify shifts in RNA 3′ formation in fpa mutants. Here we show that specific chimeric RNAs formed between the exons of otherwise separate genes are a striking consequence of loss of FPA function. We define intergenic read-through transcripts resulting from defective RNA 3′ end formation in fpa mutants and detail cryptic splicing and antisense transcription associated with these read-through RNAs. We identify alternative polyadenylation within introns that is sensitive to FPA and show FPA-dependent shifts in IBM1 poly(A) site selection that differ from those recently defined in mutants defective in intragenic heterochromatin and DNA methylation. Finally, we show that defective termination at specific loci in fpa mutants is shared with dicer-like 1 (dcl1) or dcl4 mutants, leading us to develop alternative explanations for some silencing roles of these proteins. We relate our findings to the impact that altered patterns of 3′ end formation can have on gene and genome organisation.
The ends of almost all eukaryotic protein-coding genes are defined by a poly(A) signal. When genes are transcribed into mRNA by RNA polymerase II, the poly(A) signal guides cleavage of the precursor mRNA at a particular site; this is accompanied by the addition of a poly(A) tail to the mRNA and termination of transcription. Many genes have more than one poly(A) signal and the regulated choice of which to select can effectively determine what the gene will code for, how the gene can be regulated and where transcription termination occurs. We discovered a rare example of a regulator of poly(A) site choice, called FPA, while studying flower development in the model plant Arabidopsis thaliana. Studying FPA therefore provides an opportunity to understand not only its roles in plant biology but also the generic consequences of disrupting alternative polyadenylation. In this study, we use a technique called direct RNA sequencing to quantify genome-wide shifts in poly(A) site selection in plants that lack FPA function. One of our most striking findings is that in the absence of FPA we detect chimeric RNAs formed between two otherwise separate and well-characterised genes.
RNA-binding proteins (RBPs) play an important role in plant host-microbe interactions. In this study, we show that the plant RBP known as FPA, which regulates 3′-end mRNA polyadenylation, negatively regulates basal resistance to bacterial pathogen Pseudomonas syringae in Arabidopsis. A custom microarray analysis reveals that flg22, a peptide derived from bacterial flagellins, induces expression of alternatively polyadenylated isoforms of mRNA encoding the defence-related transcriptional repressor ETHYLENE RESPONSE FACTOR 4 (ERF4), which is regulated by FPA. Flg22 induces expression of a novel isoform of ERF4 that lacks the ERF-associated amphiphilic repression (EAR) motif, while FPA inhibits this induction. The EAR-lacking isoform of ERF4 acts as a transcriptional activator in vivo and suppresses the flg22-dependent reactive oxygen species burst. We propose that FPA controls use of proximal polyadenylation sites of ERF4, which quantitatively limit the defence response output.
CudA, a nuclear protein required for Dictyostelium prespore-specific gene expression, binds in vivo to the promoter of the cotC prespore gene. A 14 nucleotide region of the cotC promoter binds CudA in vitro and ECudA, an Entamoeba CudA homologue, also binds to this site. The CudA and ECudA DNA-binding sites contain a dyad and, consistent with a symmetrical binding site, CudA forms a homodimer in the yeast two-hybrid system. Mutation of CudA binding sites within the cotC promoter reduces expression from cotC in prespore cells. The CudA and ECudA proteins share a 120 amino acid core of homology, and clustered point mutations introduced into two highly conserved motifs within the ECudA core region decrease its specific DNA binding in vitro. This region, the presumptive DNA-binding domain, is similar in sequence to domains in two Arabidopsis proteins and one Oryza protein. Significantly, these are the only proteins in the two plant species that contain an SH2 domain. Such a structure, with a DNA-binding domain located upstream of an SH2 domain, suggests that the plant proteins are orthologous to metazoan STATs. Consistent with this notion, the DNA sequence of the CudA half site, GAA, is identical to metazoan STAT half sites, although the relative positions of the two halves of the dyad are reversed. These results define a hitherto unrecognised class of transcription factors and suggest a model for the evolution of STATs and their DNA-binding sites.
Dictyostelium; CudA; Amoebozoa; Plant STATs; SH2 domains
It has recently been shown that RNA 3′ end formation plays a more widespread role in controlling gene expression than previously thought. In order to examine the impact of regulated 3′ end formation genome-wide we applied direct RNA sequencing to A. thaliana. Here we show the authentic transcriptome in unprecedented detail and how 3′ end formation impacts genome organization. We reveal extreme heterogeneity in RNA 3′ ends, discover previously unrecognized non-coding RNAs and propose widespread re-annotation of the genome. We explain the origin of most poly(A)+ antisense RNAs and identify cis-elements that control 3′ end formation in different registers. These findings are essential to understand what the genome actually encodes, how it is organized and the impact of regulated 3′ end formation on these processes.
Small nucleolar RNAs (snoRNAs) function mainly as guides for the post-transcriptional modification of ribosomal RNAs (rRNAs). In recent years, several studies have identified a wealth of small fragments (<35 nt) derived from snoRNAs (termed sdRNAs) that stably accumulate in the cell, some of which may regulate splicing or translation. A comparison of human small RNA deep sequencing data sets reveals that box C/D sdRNA accumulation patterns are conserved across multiple cell types although the ratio of the abundance of different sdRNAs from a given snoRNA varies. sdRNA profiles of many snoRNAs are specific and resemble the cleavage profiles of miRNAs. Many do not show characteristics of general RNA degradation, as seen for the accumulation of small fragments derived from snRNA or rRNA. While 53% of the sdRNAs contain an snoRNA box C motif and boxes D and D′ are also common in sdRNAs (54%), relatively few (12%) contain a full snoRNA guide region. One box C/D snoRNA, HBII-180C, was analysed in greater detail, revealing the presence of C′ box-containing sdRNAs complementary to several pre-messenger RNAs (pre-mRNAs) including FGFR3. Functional analyses demonstrated that this region of HBII-180C can influence the alternative splicing of FGFR3 pre-mRNA, supporting a role for some snoRNAs in the regulation of splicing.
► Identifies key considerations in target selection and optimisation.
► Approaches to assign useful protein features and structure/function relationships.
► Comparison of latest crystallisation propensity predictors on nonredundant data.
► Discusses single point of reference target selection/optimisation resources.
► Guidance on using the SSPF Target Optimisation Utility (TarO).
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline.
Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.
MSA, Multiple Sequence Alignment; PTM, Post Translational Modification; SSPF, Scottish Structural Proteomics Facility; MCC, Matthews correlation coefficient; AROC, Area Under the Receiver Operator Characteristic curve; Target selection; Crystallisation; Structural genomics; Structural biology; Bioinformatics; Construct design
Nucleolar localization sequences (NoLSs) are short targeting sequences responsible for the localization of proteins to the nucleolus. Given the large number of proteins experimentally detected in the nucleolus and the central role of this subnuclear compartment in the cell, NoLSs are likely to be important regulatory elements controlling cellular traffic. Although many proteins have been reported to contain NoLSs, the systematic characterization of this group of targeting motifs has only recently been carried out.
Here, we describe NoD, a web server and a command line program that predicts the presence of NoLSs in proteins. Using the web server, users can submit protein sequences through the NoD input form and are provided with a graphical output of the NoLS score as a function of protein position. While the web server is most convenient for making predictions for just a few proteins, the command line version of NoD can return predictions for complete proteomes. NoD is based on our recently described human-trained artificial neural network predictor. Through stringent independent testing of the predictor using available experimentally validated NoLS-containing eukaryotic and viral proteins, the NoD sensitivity and positive predictive value were estimated to be 71% and 79% respectively.
NoD is the first tool to provide predictions of nucleolar localization sequences in diverse eukaryotes and viruses. NoD can be run interactively online at http://www.compbio.dundee.ac.uk/nod or downloaded to use locally.
nucleolus; protein targeting signal; protein localization; NoD web server
Summary: JABAWS is a web services framework that simplifies the deployment of web services for bioinformatics. JABAWS:MSA provides services for five multiple sequence alignment (MSA) methods (Probcons, T-coffee, Muscle, Mafft and ClustalW), and is the system employed by the Jalview multiple sequence analysis workbench since version 2.6. A fully functional, easy to set up server is provided as a Virtual Appliance (VA), which can be run on most operating systems that support a virtualization environment such as VMware or Oracle VirtualBox. JABAWS is also distributed as a Web Application aRchive (WAR) and can be configured to run on a single computer and/or a cluster managed by Grid Engine, LSF or other queuing systems that support DRMAA. JABAWS:MSA provides clients full access to each application's parameters, and allows administrators to specify named parameter preset combinations and execution limits for each application through simple configuration files. The JABAWS command-line client allows integration of JABAWS services into conventional scripts.
Availability and Implementation: JABAWS is made freely available under the Apache 2 license and can be obtained from: http://www.compbio.dundee.ac.uk/jabaws.
Staphylococcus aureus is a major human pathogen and strains resistant to existing treatments continue to emerge. Development of novel treatments is therefore important. Antimicrobial peptides represent a source of potential novel antibiotics to combat resistant bacteria such as Methicillin-Resistant Staphylococcus aureus (MRSA). A promising antimicrobial peptide is ranalexin, which has potent activity against Gram-positive bacteria, and particularly S. aureus. Understanding mode of action is a key component of drug discovery and network biology approaches enable a global, integrated view of microbial physiology, including mechanisms of antibiotic killing. We developed a systems-wide functional association network approach to integrate proteome and transcriptome profiles, enabling study of drug resistance and mode of action.
The functional association network was constructed by Bayesian logistic regression, providing a framework for identification of antimicrobial peptide (ranalexin) response modules from S. aureus MRSA-252 transcriptome and proteome profiling. These signatures of ranalexin treatment revealed multiple killing mechanisms, including cell wall activity. Cell wall effects were supported by gene disruption and osmotic fragility experiments. Furthermore, twenty-two novel virulence factors were inferred, while the VraRS two-component system and PhoU-mediated persister formation were implicated in MRSA tolerance to cationic antimicrobial peptides.
This work demonstrates a powerful integrative approach to study drug resistance and mode of action. Our findings are informative to the development of novel therapeutic strategies against Staphylococcus aureus and particularly MRSA.
The SWI/SNF complex acts to constrain distribution of the centromeric histone variant Cse4
The SWI/SNF complex has an important role in regulating chromatin structure during transcriptional activation and DNA repair. Here, we show that the SWI/SNF complex is also involved in the organisation of centromeric chromatin and prevention of the ectopic deposition of centromeric histone variants.
In order to gain insight into the function of the Saccharomyces cerevisiae SWI/SNF complex, we have identified DNA sequences to which it is bound genomewide. One surprising observation is that the complex is enriched at the centromeres of each chromosome. Deletion of the gene encoding the Snf2 subunit of the complex was found to cause partial redistribution of the centromeric histone variant Cse4 to sites on chromosome arms. Cultures of snf2Δ yeast were found to progress through mitosis slowly. This was dependent on the mitotic checkpoint protein Mad2. In the absence of Mad2, defects in chromosome segregation were observed. In the absence of Snf2, chromatin organisation at centromeres is less distinct. In particular, hypersensitive sites flanking the Cse4-containing nucleosomes are less pronounced. Furthermore, SWI/SNF complex was found to be especially effective in the dissociation of Cse4-containing chromatin in vitro. This suggests a role for Snf2 in the maintenance of point centromeres involving the removal of Cse4 from ectopic sites.
centromere; chromatin; Cse4; nucleosome; SWI/SNF
Although primarily known as the site of ribosome subunit production, the nucleolus is involved in numerous and diverse cellular processes. Recent large-scale proteomics projects have identified thousands of human proteins that associate with the nucleolus. However, in most cases, we know neither the fraction of each protein pool that is nucleolus-associated nor whether their association is permanent or conditional.
To describe the dynamic localisation of proteins in the nucleolus, we investigated the extent of nucleolar association of proteins by first collating an extensively curated literature-derived dataset. This dataset then served to train a probabilistic predictor which integrates gene and protein characteristics. Unlike most previous experimental and computational studies of the nucleolar proteome that produce large static lists of nucleolar proteins regardless of their extent of nucleolar association, our predictor models the fluidity of the nucleolus by considering different classes of nucleolar-associated proteins. The new method predicts all human proteins as either nucleolar-enriched, nucleolar-nucleoplasmic, nucleolar-cytoplasmic or non-nucleolar. Leave-one-out cross validation tests reveal sensitivity values for these four classes ranging from 0.72 to 0.90 and positive predictive values ranging from 0.63 to 0.94. The overall accuracy of the classifier was measured to be 0.85 on an independent literature-based test set and 0.74 using a large independent quantitative proteomics dataset. While the three nucleolar-association groups display vastly different Gene Ontology biological process signatures and evolutionary characteristics, they collectively represent the most well characterised nucleolar functions.
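As an illustration of how per-class sensitivity, positive predictive value and overall accuracy of the kind reported above can be computed for a four-class predictor, the following minimal Python sketch tallies them from paired true and predicted labels. The function names and the toy labels are hypothetical and are not taken from the study; the probabilistic predictor itself is not reproduced here.

```python
from collections import Counter

# The four classes named in the abstract.
CLASSES = [
    "nucleolar-enriched",
    "nucleolar-nucleoplasmic",
    "nucleolar-cytoplasmic",
    "non-nucleolar",
]


def per_class_metrics(true_labels, predicted_labels, classes=CLASSES):
    """Return {class: (sensitivity, ppv)}; None where a denominator is zero."""
    pairs = Counter(zip(true_labels, predicted_labels))
    metrics = {}
    for c in classes:
        tp = pairs[(c, c)]
        actual = sum(n for (t, _), n in pairs.items() if t == c)  # TP + FN
        called = sum(n for (_, p), n in pairs.items() if p == c)  # TP + FP
        sensitivity = tp / actual if actual else None
        ppv = tp / called if called else None
        metrics[c] = (sensitivity, ppv)
    return metrics


def accuracy(true_labels, predicted_labels):
    """Fraction of proteins assigned to their true class."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical toy labels for four proteins, not data from the study.
truth = ["nucleolar-enriched", "nucleolar-enriched",
         "non-nucleolar", "nucleolar-cytoplasmic"]
calls = ["nucleolar-enriched", "non-nucleolar",
         "non-nucleolar", "nucleolar-cytoplasmic"]
toy_metrics = per_class_metrics(truth, calls)
toy_accuracy = accuracy(truth, calls)
```

In a leave-one-out setting, each predicted label would come from a model trained on all other proteins; the bookkeeping shown here stays the same.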
Our proteome-wide classification of nucleolar association provides a novel representation of the dynamic content of the nucleolus. This model of nucleolar localisation thus increases the coverage while providing accurate and specific annotations of the nucleolar proteome. It will be instrumental in better understanding the central role of the nucleolus in the cell and its interaction with other subcellular compartments.
There are two main classes of small nucleolar RNAs (snoRNAs): the box C/D snoRNAs and the box H/ACA snoRNAs that function as guide RNAs to direct sequence-specific modification of rRNA precursors and other nucleolar RNA targets. A previous computational and biochemical analysis revealed a possible evolutionary relationship between miRNA precursors and some box H/ACA snoRNAs. Here, we investigate a similar evolutionary relationship between a subset of miRNA precursors and box C/D snoRNAs. Computational analyses identified 84 intronic miRNAs that are encoded within either box C/D snoRNAs, or in precursors showing similarity to box C/D snoRNAs. Predictions of the folded structures of these box C/D snoRNA-like miRNA precursors resemble the structures of known box C/D snoRNAs, with the boxes C and D often in close proximity in the folded molecule. All five box C/D snoRNA-like miRNA precursors tested (miR-27b, miR-16-1, mir-28, miR-31 and let-7g) bind to fibrillarin, a specific protein component of functional box C/D snoRNP complexes. The data suggest that a subset of small regulatory RNAs may have evolved from box C/D snoRNAs.
Although the nucleolar localization of proteins is often believed to be mediated primarily by non-specific retention to core nucleolar components, many examples of short nucleolar targeting sequences have been reported in recent years. In this article, 46 human nucleolar localization sequences (NoLSs) were collated from the literature and subjected to statistical analysis. Of the residues in these NoLSs 48% are basic, whereas 99% of the residues are predicted to be solvent-accessible with 42% in α-helix and 57% in coil. The sequence and predicted protein secondary structure of the 46 NoLSs were used to train an artificial neural network to identify NoLSs. At a true positive rate of 54%, the predictor’s overall false positive rate (FPR) is estimated to be 1.52%, which can be broken down to FPRs of 0.26% for randomly chosen cytoplasmic sequences, 0.80% for randomly chosen nucleoplasmic sequences and 12% for nuclear localization signals. The predictor was used to predict NoLSs in the complete human proteome and 10 of the highest scoring previously unknown NoLSs were experimentally confirmed. NoLSs are a prevalent type of targeting motif that is distinct from nuclear localization signals and that can be computationally predicted.
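The compositional statistic quoted above (the share of basic residues in a sequence) can be sketched in a few lines of Python. This is only an illustration of the residue count, not the neural network predictor; the choice of lysine and arginine as the basic residues, the window length and the function names are assumptions made for this example.

```python
# Assumption: lysine (K) and arginine (R) are counted as basic; histidine is not.
BASIC = frozenset("KR")


def basic_fraction(seq):
    """Fraction of residues in seq that are basic under the assumption above."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(residue in BASIC for residue in seq) / len(seq)


def scan_basic_windows(seq, window=20):
    """Return (start, fraction) for every full-length window, 0-based starts.

    The default window of 20 residues is an arbitrary illustrative choice.
    """
    seq = seq.upper()
    return [
        (i, basic_fraction(seq[i:i + window]))
        for i in range(len(seq) - window + 1)
    ]
```

For example, `basic_fraction("KRAA")` returns 0.5, and scanning a whole protein with `scan_basic_windows` gives a simple per-position profile of basic-residue density.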
In this manuscript we describe the characterisation of human snoRNAs that co-purify with nucleoli and develop a new vector based system for targeted gene knock down. We demonstrate that this novel vector system (snoMEN) can deliver effective, sequence-specific knock down of endogenous cellular genes as well as GFP and GFP-fusion proteins.
Human small nucleolar RNAs (snoRNAs) that copurify with nucleoli isolated from HeLa cells have been characterized. Novel fibrillarin-associated snoRNAs were detected that allowed the creation of a new vector system for the targeted knockdown of one or more genes in mammalian cells. The snoMEN (snoRNA modulator of gene expressioN) vector technology is based on snoRNA HBII-180C, which contains an internal sequence that can be manipulated to make it complementary to RNA targets. Gene-specific knockdowns are demonstrated for endogenous cellular proteins and for G/YFP-fusion proteins. Multiplex snoMEN vectors coexpress multiple snoRNAs in one transcript, targeted either to different genes or to different sites in the same gene. Protein replacement snoMEN vectors can express a single transcript combining cDNA for a tagged protein with introns containing cognate snoRNAs targeted to knockdown the endogenous cellular protein. We foresee applications for snoMEN vectors in basic gene expression research, target validation, and gene therapy.
MicroRNAs (miRNAs) and small nucleolar RNAs (snoRNAs) are two classes of small non-coding regulatory RNAs, which have been much investigated in recent years. While their respective functions in the cell are distinct, they share interesting genomic similarities, and recent sequencing projects have identified processed forms of snoRNAs that resemble miRNAs. Here, we investigate a possible evolutionary relationship between miRNAs and box H/ACA snoRNAs. A comparison of the genomic locations of reported miRNAs and snoRNAs reveals an overlap of specific members of these classes. To test the hypothesis that some miRNAs might have evolved from snoRNA encoding genomic regions, reported miRNA-encoding regions were scanned for the presence of box H/ACA snoRNA features. Twenty miRNA precursors show significant similarity to H/ACA snoRNAs as predicted by snoGPS. These include molecules predicted to target known ribosomal RNA pseudouridylation sites in vivo for which no guide snoRNA has yet been reported. The predicted folded structures of these twenty H/ACA snoRNA-like miRNA precursors reveal molecules which resemble the structures of known box H/ACA snoRNAs. The genomic regions surrounding these predicted snoRNA-like miRNAs are often similar to regions around snoRNA retroposons, including the presence of transposable elements, target site duplications and poly(A) tails. We further show that the precursors of five H/ACA snoRNA-like miRNAs (miR-151, miR-605, mir-664, miR-215 and miR-140) bind to dyskerin, a specific protein component of functional box H/ACA small nucleolar ribonucleoprotein complexes, suggesting that these molecules have retained some H/ACA snoRNA functionality. The detection of small RNA molecules that share features of miRNAs and snoRNAs suggests that these classes of RNA may have an evolutionary relationship.
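The comparison of genomic locations described above amounts to an interval-overlap test between two sets of annotations. A minimal Python sketch follows; the `Feature` record, the requirement that overlapping features lie on the same strand, and the example coordinates are all assumptions made for illustration and do not come from the study.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """A genomic annotation with 1-based, inclusive coordinates."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str


def overlaps(a, b):
    """True if a and b share at least one base on the same chromosome and strand."""
    return (
        a.chrom == b.chrom
        and a.strand == b.strand  # assumption: only same-strand overlaps count
        and a.start <= b.end
        and b.start <= a.end
    )


def find_overlaps(mirnas, snornas):
    """Return (miRNA name, snoRNA name) for every overlapping pair."""
    return [(m.name, s.name) for m in mirnas for s in snornas if overlaps(m, s)]


# Hypothetical annotations with made-up names and coordinates.
toy_mirnas = [
    Feature("miR-A", "chr1", 100, 180, "+"),
    Feature("miR-B", "chr2", 500, 570, "-"),
]
toy_snornas = [
    Feature("snoRNA-X", "chr1", 150, 280, "+"),
    Feature("snoRNA-Y", "chr2", 500, 630, "+"),
]
toy_pairs = find_overlaps(toy_mirnas, toy_snornas)
```

The all-against-all loop is adequate for a few thousand annotations; sorting the features or using an interval tree would be the usual choice for larger sets.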
The major functions known for RNA were long believed to be either messenger RNAs, which function as intermediates between genes and proteins, or ribosomal RNAs and transfer RNAs which carry out the translation process. In recent years, however, newly discovered classes of small RNAs have been shown to play important cellular roles. These include microRNAs (miRNAs), which can regulate the production of specific proteins, and small nucleolar RNAs (snoRNAs), which recognise and chemically modify specific sequences in ribosomal RNA. Although miRNAs and snoRNAs are currently believed to be generated by different cellular pathways and to function in different cellular compartments, members of these two types of small RNAs display numerous genomic similarities, and a small number of snoRNAs have been shown to encode miRNAs in several organisms. Here we systematically investigate a possible evolutionary relationship between snoRNAs and miRNAs. Using computational analysis, we identify twenty genomic regions encoding miRNAs with highly significant similarity to snoRNAs, both on the level of their surrounding genomic context as well as their predicted folded structure. A subset of these miRNAs display functional snoRNA characteristics, strengthening the possibility that these miRNA molecules might have evolved from snoRNAs.
Asparagine-linked glycosylation is catalysed by oligosaccharyltransferase (OTase). In Trypanosoma brucei OTase activity is catalysed by single-subunit enzymes encoded by three paralogous genes of which TbSTT3B and TbSTT3C can complement a yeast Δstt3 mutant. The two enzymes have overlapping but distinct peptide acceptor specificities, with TbSTT3C displaying an enhanced ability to glycosylate sites flanked by acidic residues. TbSTT3A and TbSTT3B, but not TbSTT3C, are transcribed in the bloodstream and procyclic life cycle stages of T. brucei. Selective knockdown and analysis of parasite protein N-glycosylation showed that TbSTT3A selectively transfers biantennary Man5GlcNAc2 to specific glycosylation sites whereas TbSTT3B selectively transfers triantennary Man9GlcNAc2 to others. Analysis of T. brucei glycosylation site occupancy showed that TbSTT3A and TbSTT3B glycosylate sites in acidic to neutral and neutral to basic regions of polypeptide, respectively. This embodiment of distinct specificities in single-subunit OTases may have implications for recombinant glycoprotein engineering. TbSTT3A and TbSTT3B could be knocked down individually, but not collectively, in tissue culture. However, both were independently essential for parasite growth in mice, suggesting that inhibiting protein N-glycosylation could have therapeutic potential against trypanosomiasis.
glycosylation; oligosaccharyltransferase; STT3;
Sar2676, a pantothenate synthetase with a molecular weight of 31 419 Da from methicillin-resistant Staphylococcus aureus, has been expressed, purified and crystallized at 293 K. The protein crystallizes in a primitive triclinic lattice, with unit-cell parameters a = 45.3, b = 60.5, c = 117.6 Å, α = 87.2, β = 81.2, γ = 68.4°. A complete data set has been collected to 2.3 Å resolution at the ESRF. Consideration of the likely solvent content suggested the asymmetric unit to contain four molecules. This has been confirmed by molecular-replacement phasing calculations, which give a solution with four monomers using a monomer of pantothenate synthetase from Escherichia coli (PDB code 1iho), which is 41% identical to Sar2676, as a search model.
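The solvent-content reasoning above can be reproduced as a short calculation: compute the triclinic cell volume from the unit-cell parameters, then divide by the total protein mass in the cell to obtain the Matthews coefficient. The Python sketch below uses the cell parameters and molecular weight quoted in the abstract. It assumes one asymmetric unit per primitive triclinic cell and uses the common approximation that the solvent fraction is 1 - 1.23/V_M; the function names are illustrative.

```python
import math


def triclinic_volume(a, b, c, alpha, beta, gamma):
    """Unit-cell volume in cubic angstroms; lengths in angstroms, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(cell_volume, n_molecules, mol_weight_da):
    """V_M in cubic angstroms per dalton for n_molecules in the unit cell."""
    return cell_volume / (n_molecules * mol_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction, assuming a typical protein density."""
    return 1 - 1.23 / v_m


# Values quoted in the abstract; four molecules per cell is the abstract's
# asymmetric-unit content, taken here as the full cell content.
volume = triclinic_volume(45.3, 60.5, 117.6, 87.2, 81.2, 68.4)
v_m = matthews_coefficient(volume, n_molecules=4, mol_weight_da=31419)
solvent = solvent_fraction(v_m)
print(f"V = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, solvent = {solvent:.0%}")
```

Repeating the calculation with `n_molecules` set to 3 or 5 shows how the estimated solvent fraction moves as the assumed number of molecules changes, which is the basis for choosing the most plausible value.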
Sar2676; pantothenate synthetase; methicillin-resistant Staphylococcus aureus
Summary: Jalview Version 2 is a system for interactive WYSIWYG editing, analysis and annotation of multiple sequence alignments. Core features include keyboard and mouse-based editing, multiple views and alignment overviews, and linked structure display with Jmol. Jalview 2 is available in two forms: a lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment, secondary structure prediction and the retrieval of alignments, sequences, annotation and structures from public databases and any DAS 1.53 compliant sequence or annotation server.
Availability: The Jalview 2 Desktop application and JalviewLite applet are made freely available under the GPL, and can be downloaded from www.jalview.org
The PIPs database (http://www.compbio.dundee.ac.uk/www-pips) is a resource for studying protein–protein interactions in human. It contains predictions of >37 000 high probability interactions of which >34 000 are not reported in the interaction databases HPRD, BIND, DIP or OPHID. The interactions in PIPs were calculated by a Bayesian method that combines information from expression, orthology, domain co-occurrence, post-translational modifications and sub-cellular location. The predictions also take account of the topology of the predicted interaction network. The web interface to PIPs ranks predictions according to their likelihood of interaction broken down by the contribution from each information source and with easy access to the evidence that supports each prediction. Where data exists in OPHID, HPRD, DIP or BIND for a protein pair this is also reported in the output tables returned by a search. A network browser is included to allow convenient browsing of the interaction network for any protein in the database. The PIPs database provides a new resource on protein–protein interactions in human that is straightforward to browse, or can be exploited completely, for interaction network modelling.
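One common way to combine several evidence sources in a Bayesian framework is to multiply prior odds by a likelihood ratio for each source, which also makes the contribution of each source easy to report. The Python sketch below shows that idea only; it assumes the sources are conditionally independent (a naive Bayes simplification), which the abstract does not state, and every number and name in it is hypothetical rather than taken from PIPs.

```python
import math


def posterior_odds(prior_odds, likelihood_ratios):
    """Combine prior odds with per-source likelihood ratios.

    Assumes the evidence sources are conditionally independent, so the
    ratios multiply. The sum is taken in log space for numerical stability.
    """
    log_odds = math.log(prior_odds) + sum(
        math.log(lr) for lr in likelihood_ratios.values()
    )
    return math.exp(log_odds)


def ranked_contributions(likelihood_ratios):
    """Evidence sources ordered from most to least supportive."""
    return sorted(likelihood_ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical likelihood ratios for one protein pair.
toy_evidence = {"expression": 10.0, "orthology": 50.0, "domain co-occurrence": 4.0}
toy_odds = posterior_odds(prior_odds=0.001, likelihood_ratios=toy_evidence)
toy_probability = toy_odds / (1 + toy_odds)
```

A pair would be reported as a high-probability interaction when the posterior odds exceed a chosen threshold; the threshold and the prior are modelling choices not specified here.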
The regulation of protein function through reversible phosphorylation by protein kinases and phosphatases is a general mechanism controlling virtually every cellular activity. Eukaryotic protein kinases can be classified into distinct, well-characterized groups based on amino acid sequence similarity and function. We recently reported a highly sensitive and accurate hidden Markov model-based method for the automatic detection and classification of protein kinases into these specific groups. The Kinomer v. 1.0 database presented here contains annotated classifications for the protein kinase complements of 43 eukaryotic genomes. These span the taxonomic range and include fungi (16 species), plants (6), diatoms (1), amoebas (2), protists (1) and animals (17). The kinomes are stored in a relational database and are accessible through a web interface on the basis of species, kinase group or a combination of both. In addition, the Kinomer v. 1.0 HMM library is made available for users to perform classification on arbitrary sequences. The Kinomer v. 1.0 database is a continually updated resource where direct comparison of kinase sequences across kinase groups and across species can give insights into kinase function and evolution. Kinomer v. 1.0 is available at http://www.compbio.dundee.ac.uk/kinomer/.
SCANPS performs iterative profile searching similar to PSI-BLAST but with full dynamic programming on each cycle and on-the-fly estimation of significance. This combination gives good sensitivity and selectivity that outperforms PSI-BLAST in domain-searching benchmarks. Although computationally expensive, SCANPS exploits on-chip parallelism (MMX and SSE2 instructions on Intel chips) as well as MPI parallelism to give acceptable turnround times even for large databases. A web server developed to run SCANPS searches is now available at http://www.compbio.dundee.ac.uk/www-scanps. The server interface allows a range of different protein sequence databases to be searched, including the SCOP database of protein domains. The server provides the user with regularly updated versions of the main protein sequence databases and is backed up by significant computing resources, which ensure that searches are performed rapidly. For SCOP searches, the results may be viewed in a new tree-based representation that reflects the structure of the SCOP hierarchy; this aids the user in placing each hit in the context of its SCOP classification and understanding its relationship to other domains in SCOP.
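To make "full dynamic programming" concrete, the sketch below computes a Smith-Waterman local alignment score between two sequences. It is a generic textbook formulation, not the SCANPS implementation: the simple match/mismatch scores and the linear gap penalty are assumptions made for brevity, whereas a real profile search would use position-specific scores and typically affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Every cell of the dynamic programming matrix is filled, with no
    heuristic pre-filtering. Only two rows are kept in memory because
    the score alone does not require a traceback.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("XXACGTYY", "ACGT"))
```

The cost is proportional to the product of the two sequence lengths for every database entry, which is why vectorised instructions and MPI parallelism matter for acceptable run times.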
Jpred (http://www.compbio.dundee.ac.uk/jpred) is a secondary structure prediction server powered by the Jnet algorithm. Jpred performs over 1000 predictions per week for users in more than 50 countries. The recently updated Jnet algorithm provides a three-state (α-helix, β-strand and coil) prediction of secondary structure at an accuracy of 81.5%. Given either a single protein sequence or a multiple sequence alignment, Jpred derives alignment profiles from which predictions of secondary structure and solvent accessibility are made. The predictions are presented as coloured HTML, plain text, PostScript, PDF and via the Jalview alignment editor to allow flexibility in viewing and applying the data. The new Jpred 3 server includes significant usability improvements, among them clearer feedback on the progress or failure of submitted requests. Functional improvements include batch submission of sequences, summary results via email and updates to the search databases. A new software pipeline will enable Jnet/Jpred to continue to be updated in sync with major updates to SCOP and UniProt and so ensure that Jpred 3 will maintain high-accuracy predictions.
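A three-state accuracy figure such as the one quoted above is commonly computed as the percentage of residues whose predicted state matches the observed state, often called Q3. The sketch below assumes that definition and a one-letter encoding (H for helix, E for strand, C for coil); the function name and the example strings are hypothetical and are not Jpred output.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues with matching three-state assignments.

    Both arguments are equal-length strings over the assumed
    alphabet H (helix), E (strand) and C (coil).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical ten-residue example with one mismatch at the last position.
print(q3_accuracy("HHHHEEECCC", "HHHHEEECCH"))
```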
TarO (http://www.compbio.dundee.ac.uk/taro) offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural biology techniques. The protein sequence is analysed by 17 algorithms and compared to 8 databases. TarO gathers putative homologues, including orthologues, and then obtains predictions of properties for these sequences including crystallisation propensity, protein disorder and post-translational modifications. Analyses are run on a high-performance computing cluster, and the results are integrated, stored in a database and accessed through a web-based user interface. Output is in tabulated format and in the form of an annotated multiple sequence alignment (MSA) that may be edited interactively in the program Jalview. TarO also simplifies the gathering of additional annotations via the Distributed Annotation System, both from the MSA in Jalview and through links to Dasty2. Routes to other information gateways are included, for example to relevant pages from UniProt, COG and the Conserved Domains Database. Open access to TarO is available from a guest account, with private accounts for academic use available on request. Future development of TarO will include further analysis steps and integration with the Protein Information Management System (PIMS), a sister project in the BBSRC ‘Structural Proteomics of Rational Targets’ initiative.