Results 1-12 (12)

1.  Genome sequence of the human malaria parasite Plasmodium falciparum 
Nature  2002;419(6906):10.1038/nature01097.
The parasite Plasmodium falciparum is responsible for hundreds of millions of cases of malaria, and kills more than one million African children annually. Here we report an analysis of the genome sequence of P. falciparum clone 3D7. The 23-megabase nuclear genome consists of 14 chromosomes, encodes about 5,300 genes, and is the most (A + T)-rich genome sequenced to date. Genes involved in antigenic variation are concentrated in the subtelomeric regions of the chromosomes. Compared to the genomes of free-living eukaryotic microbes, the genome of this intracellular parasite encodes fewer enzymes and transporters, but a large proportion of genes are devoted to immune evasion and host–parasite interactions. Many nuclear-encoded proteins are targeted to the apicoplast, an organelle involved in fatty-acid and isoprenoid metabolism. The genome sequence provides the foundation for future studies of this organism, and is being exploited in the search for new drugs and vaccines to fight malaria.
PMCID: PMC3836256  PMID: 12368864
2.  Global Quantitative SILAC Phosphoproteomics Reveals Differential Phosphorylation Is Widespread between the Procyclic and Bloodstream Form Lifecycle Stages of Trypanosoma brucei 
Journal of Proteome Research  2013;12(5):2233-2244.
We report a global quantitative phosphoproteomic study of bloodstream and procyclic form Trypanosoma brucei using SILAC labeling of each lifecycle stage. Phosphopeptide enrichment by SCX and TiO2 led to the identification of a total of 10096 phosphorylation sites on 2551 protein groups and quantified the ratios of 8275 phosphorylation sites between the two lifecycle stages. More than 9300 of these sites (92%) have not previously been reported. Model-based gene enrichment analysis identified overrepresentation of Gene Ontology terms relating to the flagella, protein kinase activity, and the regulation of gene expression. The quantitative data reveal that differential protein phosphorylation is widespread between bloodstream and procyclic form trypanosomes, with significant intraprotein differential phosphorylation. Despite a lack of dedicated tyrosine kinases, 234 phosphotyrosine residues were identified, and these were 3–4 fold over-represented among sites changing >10-fold between the two lifecycle stages. A significant proportion of the T. brucei kinome was phosphorylated, with evidence that MAPK pathways are functional in both lifecycle stages. Regulation of gene expression in T. brucei is exclusively post-transcriptional, and the extensive phosphorylation of RNA binding proteins observed may be relevant to the control of mRNA stability in this organism.
PMCID: PMC3646404  PMID: 23485197
phosphorylation; SILAC; Trypanosoma brucei; quantitative proteomics; phosphoproteomics
3.  Tumorigenic fragments of APC cause dominant defects in directional cell migration in multiple model systems 
Disease Models & Mechanisms  2012;5(6):940-947.
Nonsense mutations that result in the expression of truncated N-terminal fragments of the adenomatous polyposis coli (APC) tumour suppressor protein are found in most sporadic and some hereditary colorectal cancers. These mutations can cause tumorigenesis by eliminating β-catenin-binding sites from APC, which leads to upregulation of β-catenin and thereby results in the induction of oncogenes such as MYC. Here we show that, in three distinct experimental model systems, expression of an N-terminal fragment of APC (N-APC) results in loss of directionality, but not speed, of cell motility independently of changes in β-catenin regulation. We developed a system to culture and fluorescently label live pieces of gut tissue to record high-resolution three-dimensional time-lapse movies of cells in situ. This revealed an unexpected complexity of normal gut cell migration, a key process in gut epithelial maintenance, with cells moving with spatial and temporal discontinuity. Quantitative comparison of gut tissue from wild-type mice and APC heterozygotes (APC(Min/+); multiple intestinal neoplasia model) demonstrated that cells in precancerous epithelia lack directional preference when moving along the crypt-villus axis. This effect was reproduced in diverse experimental systems: in developing chicken embryos, mesoderm cells expressing N-APC failed to migrate normally; in amoeboid Dictyostelium, which lack endogenous APC, expressing an N-APC fragment maintained cell motility, but the cells failed to perform directional chemotaxis; and multicellular Dictyostelium slug aggregates similarly failed to perform phototaxis. We propose that N-terminal fragments of APC represent a gain-of-function mutation that causes cells within tissue to fail to migrate directionally in response to relevant guidance cues. Consistent with this idea, crypts in histologically normal tissues of APC(Min/+) intestines are overpopulated with cells, suggesting that a lack of migration might cause cell accumulation in a precancerous state.
PMCID: PMC3484875  PMID: 22563063
4.  Prophossi: automating expert validation of phosphopeptide–spectrum matches from tandem mass spectrometry 
Bioinformatics  2010;26(17):2153-2159.
Motivation: Complex patterns of protein phosphorylation mediate many cellular processes. Tandem mass spectrometry (MS/MS) is a powerful tool for identifying these post-translational modifications. In high-throughput experiments, mass spectrometry database search engines, such as MASCOT, provide a ranked list of peptide identifications based on hundreds of thousands of MS/MS spectra obtained in a mass spectrometry experiment. These search results are not in themselves sufficient for confident assignment of phosphorylation sites, as identification of characteristic mass differences requires time-consuming manual assessment of the spectra by an experienced analyst. The time required for manual assessment has previously rendered high-throughput confident assignment of phosphorylation sites challenging.
Results: We have developed a knowledge base of criteria, which replicate expert assessment, allowing more than half of cases to be automatically validated and site assignments verified with a high degree of confidence. This was assessed by comparing automated spectral interpretation with careful manual examination of the assignments for 501 peptides above the 1% false discovery rate (FDR) threshold corresponding to 259 putative phosphorylation sites in 74 proteins of the Trypanosoma brucei proteome. Despite this stringent approach, we are able to validate 80 of the 91 phosphorylation sites (88%) positively identified by manual examination of the spectra used for the MASCOT searches with an FDR < 15%.
Conclusions: High-throughput computational analysis can provide a viable second stage validation of primary mass spectrometry database search results. Such validation gives rapid access to a systems level overview of protein phosphorylation in the experiment under investigation.
Availability: A GPL licensed software implementation in Perl for analysis and spectrum annotation is available in the supplementary material and a web server can be accessed online at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2922888  PMID: 20651112
5.  The Phosphoproteome of Bloodstream Form Trypanosoma brucei, Causative Agent of African Sleeping Sickness 
The protozoan parasite Trypanosoma brucei is the causative agent of human African sleeping sickness and related animal diseases, and it has over 170 predicted protein kinases. Protein phosphorylation is a key regulatory mechanism for cellular function that, thus far, has been studied in T. brucei principally through putative kinase mRNA knockdown and observation of the resulting phenotype. However, despite the relatively large kinome of this organism and the demonstrated essentiality of several T. brucei kinases, very few specific phosphorylation sites have been determined in this organism. Using a gel-free, phosphopeptide enrichment-based proteomics approach we performed the first large scale phosphorylation site analyses for T. brucei. Serine, threonine, and tyrosine phosphorylation sites were determined for a cytosolic protein fraction of the bloodstream form of the parasite, resulting in the identification of 491 phosphoproteins based on the identification of 852 unique phosphopeptides and 1204 phosphorylation sites. The phosphoproteins detected in this study are predicted from their genome annotations to participate in a wide variety of biological processes, including signal transduction, processing of DNA and RNA, protein synthesis and degradation, and, to a minor extent, metabolic pathways. The analysis of phosphopeptides and phosphorylation sites was facilitated by in-house developed software, and this automated approach was validated by manual annotation of spectra of the kinase subset of proteins. Analysis of the cytosolic bloodstream form T. brucei kinome revealed the presence of 44 phosphorylated protein kinases in our data set that could be classified into the major eukaryotic protein kinase groups by applying a multilevel hidden Markov model library of the kinase catalytic domain. Identification of the kinase phosphorylation sites showed conserved phosphorylation sequence motifs in several kinase activation segments, supporting the view that phosphorylation-based signaling is a general and fundamental regulatory process that extends to this highly divergent lower eukaryote.
PMCID: PMC2716717  PMID: 19346560
6.  Jalview Version 2—a multiple sequence alignment editor and analysis workbench 
Bioinformatics  2009;25(9):1189-1191.
Summary: Jalview Version 2 is a system for interactive WYSIWYG editing, analysis and annotation of multiple sequence alignments. Core features include keyboard and mouse-based editing, multiple views and alignment overviews, and linked structure display with Jmol. Jalview 2 is available in two forms: a lightweight Java applet for use in web applications, and a powerful desktop application that employs web services for sequence alignment, secondary structure prediction and the retrieval of alignments, sequences, annotation and structures from public databases and any DAS 1.53 compliant sequence or annotation server.
Availability: The Jalview 2 Desktop application and JalviewLite applet are made freely available under the GPL, and can be downloaded from
PMCID: PMC2672624  PMID: 19151095
7.  Draft Genome of the Filarial Nematode Parasite Brugia malayi 
Science (New York, N.Y.)  2007;317(5845):1756-1760.
Parasitic nematodes that cause elephantiasis and river blindness threaten hundreds of millions of people in the developing world. We have sequenced the ~90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predict ~11,500 protein coding genes in 71 Mb of robustly assembled sequence. Comparative analysis with the free-living, model nematode Caenorhabditis elegans revealed that, despite these genes having maintained little conservation of local synteny during ~350 million years of evolution, they largely remain in linkage on chromosomal units. More than 100 conserved operons were identified. Analysis of the predicted proteome provides evidence for adaptations of B. malayi to niches in its human and vector hosts and insights into the molecular basis of a mutualistic relationship with its Wolbachia endosymbiont. These findings offer a foundation for rational drug design.
PMCID: PMC2613796  PMID: 17885136
8.  Kinomer v. 1.0: a database of systematically classified eukaryotic protein kinases 
Nucleic Acids Research  2008;37(Database issue):D244-D250.
The regulation of protein function through reversible phosphorylation by protein kinases and phosphatases is a general mechanism controlling virtually every cellular activity. Eukaryotic protein kinases can be classified into distinct, well-characterized groups based on amino acid sequence similarity and function. We recently reported a highly sensitive and accurate hidden Markov model-based method for the automatic detection and classification of protein kinases into these specific groups. The Kinomer v. 1.0 database presented here contains annotated classifications for the protein kinase complements of 43 eukaryotic genomes. These span the taxonomic range and include fungi (16 species), plants (6), diatoms (1), amoebas (2), protists (1) and animals (17). The kinomes are stored in a relational database and are accessible through a web interface on the basis of species, kinase group or a combination of both. In addition, the Kinomer v. 1.0 HMM library is made available for users to perform classification on arbitrary sequences. The Kinomer v. 1.0 database is a continually updated resource where direct comparison of kinase sequences across kinase groups and across species can give insights into kinase function and evolution. Kinomer v. 1.0 is available at
PMCID: PMC2686601  PMID: 18974176
9.  TarO: a target optimisation system for structural biology 
Nucleic Acids Research  2008;36(Web Server issue):W190-W196.
TarO offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural biology techniques. The protein sequence is analysed by 17 algorithms and compared to 8 databases. TarO gathers putative homologues, including orthologues, and then obtains predictions of properties for these sequences including crystallisation propensity, protein disorder and post-translational modifications. Analyses are run on a high-performance computing cluster, the results integrated, stored in a database and accessed through a web-based user interface. Output is in tabulated format and in the form of an annotated multiple sequence alignment (MSA) that may be edited interactively in the program Jalview. TarO also simplifies the gathering of additional annotations via the Distributed Annotation System, both from the MSA in Jalview and through links to Dasty2. Routes to other information gateways are included, for example to relevant pages from UniProt, COG and the Conserved Domains Database. Open access to TarO is available from a guest account with private accounts for academic use available on request. Future development of TarO will include further analysis steps and integration with the Protein Information Management System (PIMS), a sister project in the BBSRC ‘Structural Proteomics of Rational Targets’ initiative.
PMCID: PMC2447720  PMID: 18385152
10.  A preliminary crystallographic analysis of the putative mevalonate diphosphate decarboxylase from Trypanosoma brucei  
The gene encoding the putative mevalonate diphosphate decarboxylase, an enzyme from the mevalonate pathway of isoprenoid precursor biosynthesis, has been cloned from T. brucei. Recombinant protein has been expressed, purified and highly ordered crystals obtained and characterized to aid the structure–function analysis of this enzyme.
Mevalonate diphosphate decarboxylase catalyses the last and least well characterized step in the mevalonate pathway for the biosynthesis of isopentenyl pyrophosphate, an isoprenoid precursor. A gene predicted to encode the enzyme from Trypanosoma brucei has been cloned, a highly efficient expression system established and a purification protocol determined. The enzyme gives monoclinic crystals in space group P2₁, with unit-cell parameters a = 51.5, b = 168.7, c = 54.9 Å, β = 118.8°. A Matthews coefficient V_M of 2.5 Å³ Da⁻¹ corresponds to two monomers, each approximately 42 kDa (385 residues), in the asymmetric unit with 50% solvent content. These crystals are well ordered and data to high resolution have been recorded using synchrotron radiation.
PMCID: PMC1952329  PMID: 16511101
decarboxylases; mevalonate biosynthesis; isoprenoids; Trypanosoma
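The Matthews coefficient and solvent content quoted in entry 10 can be reproduced from the reported unit-cell parameters. The following is a minimal sketch, not the authors' calculation: the function names are hypothetical, it assumes two asymmetric units per unit cell (the general-position multiplicity of the monoclinic space group P2₁), and it uses the common approximation solvent fraction ≈ 1 − 1.23/V_M, which itself assumes a typical protein partial specific volume of about 0.74 cm³/g.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume in cubic angstroms for a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, asu_per_cell, monomers_per_asu, monomer_mass_da):
    """V_M = cell volume / total protein mass in the cell (cubic angstroms per dalton)."""
    return cell_volume / (asu_per_cell * monomers_per_asu * monomer_mass_da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction; 1.23 is an assumed constant for typical proteins."""
    return 1.0 - protein_factor / v_m


# Values taken from the abstract: a, b, c in angstroms, beta in degrees,
# two monomers of roughly 42 kDa per asymmetric unit.
volume = monoclinic_cell_volume(51.5, 168.7, 54.9, 118.8)
v_m = matthews_coefficient(
    volume,
    asu_per_cell=2,        # assumption: P2_1 has two asymmetric units per cell
    monomers_per_asu=2,
    monomer_mass_da=42_000,
)
solvent = solvent_fraction(v_m)

print(f"cell volume = {volume:.0f} cubic angstroms")
print(f"V_M = {v_m:.2f} cubic angstroms per Da")
print(f"solvent content = {solvent:.0%}")
```

Under these assumptions the result rounds to the reported V_M of 2.5 Å³ Da⁻¹ and about 50% solvent, which is why two monomers per asymmetric unit is a reasonable interpretation of the cell.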
11.  Identification of multiple distinct Snf2 subfamilies with conserved structural motifs 
Nucleic Acids Research  2006;34(10):2887-2905.
The Snf2 family of helicase-related proteins includes the catalytic subunits of ATP-dependent chromatin remodelling complexes found in all eukaryotes. These act to regulate the structure and dynamic properties of chromatin and so influence a broad range of nuclear processes. We have exploited progress in genome sequencing to assemble a comprehensive catalogue of over 1300 Snf2 family members. Multiple sequence alignment of the helicase-related regions enables 24 distinct subfamilies to be identified, a considerable expansion over earlier surveys. Where information is known, there is a good correlation between biological or biochemical function and these assignments, suggesting Snf2 family motor domains are tuned for specific tasks. Scanning of complete genomes reveals all eukaryotes contain members of multiple subfamilies, whereas they are less common and not ubiquitous in eubacteria or archaea. The large sample of Snf2 proteins enables additional distinguishing conserved sequence blocks within the helicase-like motor to be identified. The establishment of a phylogeny for Snf2 proteins provides an opportunity to make informed assignments of function, and the identification of conserved motifs provides a framework for understanding the mechanisms by which these proteins function.
PMCID: PMC1474054  PMID: 16738128
12.  ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins 
Nucleic Acids Research  2003;31(13):3625-3630.
Multidomain proteins predominate in eukaryotic proteomes. Individual functions assigned to different sequence segments combine to create a complex function for the whole protein. While on-line resources are available for revealing globular domains in sequences, there has hitherto been no comprehensive collection of small functional sites/motifs comparable to the globular domain resources, yet these are as important for the function of multidomain proteins. Short linear peptide motifs are used for cell compartment targeting, protein–protein interaction, regulation by phosphorylation, acetylation, glycosylation and a host of other post-translational modifications. ELM, the Eukaryotic Linear Motif server, is a new bioinformatics resource for investigating candidate short non-globular functional motifs in eukaryotic proteins, aiming to fill the void in bioinformatics tools. Sequence comparisons with short motifs are difficult to evaluate because the usual significance assessments are inappropriate. Therefore the server is implemented with several logical filters to eliminate false positives. Current filters are for cell compartment, globular domain clash and taxonomic range. In favourable cases, the filters can reduce the number of retained matches by an order of magnitude or more.
PMCID: PMC168952  PMID: 12824381
