Results 1-18 (18)

1.  A Cell-surface Phylome for African Trypanosomes 
The cell surface of Trypanosoma brucei, like many protistan blood parasites, is crucial for mediating host-parasite interactions and is instrumental to the initiation, maintenance and severity of infection. Previous comparisons with the related trypanosomatid parasites T. cruzi and Leishmania major suggest that the cell-surface proteome of T. brucei is largely taxon-specific. Here we compare genes predicted to encode cell surface proteins of T. brucei with those from two related African trypanosomes, T. congolense and T. vivax. We created a cell surface phylome (CSP) by estimating phylogenies for 79 gene families with putative surface functions to understand the more recent evolution of African trypanosome surface architecture. Our findings demonstrate that the transferrin receptor genes essential for bloodstream survival in T. brucei are conserved in T. congolense but absent from T. vivax and include an expanded gene family of insect stage-specific surface glycoproteins that includes many currently uncharacterized genes. We also identify species-specific features and innovations and confirm that these include most expression site-associated genes (ESAGs) in T. brucei, which are absent from T. congolense and T. vivax. The CSP presents the first global picture of the origins and dynamics of cell surface architecture in African trypanosomes, representing the principal differences in genomic repertoire between African trypanosome species and provides a basis from which to explore the developmental and pathological differences in surface architectures. All data can be accessed at:
Author Summary
The African trypanosome (Trypanosoma brucei) is a single-celled, vector-borne parasite that causes Human African Trypanosomiasis (or ‘sleeping sickness’) throughout sub-Saharan Africa and, along with related species T. congolense and T. vivax, a similar disease in wild and domestic animals. Together, the African trypanosomes have significant effects on human and animal health and associated costs for socio-economic development in Africa. Genes expressed on the trypanosome cell surface are instrumental in causing disease and sustaining infection by resisting the host immune system. Here we compare repertoires of genes with predicted cell-surface expression in T. brucei, T. congolense and T. vivax and estimate the phylogeny of each predicted cell-surface gene family. This ‘cell-surface phylome’ (CSP) provides a detailed analysis of species-specific gene families and of gene gain and loss in shared families, aiding the identification of surface proteins that may mediate specific aspects of pathogenesis and disease progression. Overall, the CSP suggests that each trypanosome species has modified its surface proteome uniquely, indicating that T. brucei, T. congolense and T. vivax have subtly distinct mechanisms for interacting with both vertebrate and insect hosts.
PMCID: PMC3605285  PMID: 23556014
2.  GeneDB—an annotation database for pathogens 
Nucleic Acids Research  2011;40(D1):D98-D108.
GeneDB is a genome database for prokaryotic and eukaryotic pathogens and closely related organisms. The resource provides a portal to genome sequence and annotation data, which is primarily generated by the Pathogen Genomics group at the Wellcome Trust Sanger Institute. It combines data from completed and ongoing genome projects with curated annotation, which is readily accessible from a web-based resource. The development of the database in recent years has focused on providing database-driven annotation tools and pipelines, as well as catering for increasingly frequent assembly updates. The website has been significantly redesigned to take advantage of current web technologies and to improve usability. The current release stores 41 data sets, of which 17 are manually curated and maintained by biologists, who review and incorporate data from the scientific literature, as well as other sources. GeneDB is primarily a production and annotation database for the genomes of predominantly pathogenic organisms.
PMCID: PMC3245030  PMID: 22116062
3.  Genomic-scale prioritization of drug targets: the TDR Targets database 
Nature reviews. Drug discovery  2008;7(11):900-907.
The increasing availability of genomic data for pathogens that cause tropical diseases has created new opportunities for drug discovery and development. However, if the potential of such data is to be fully exploited, the data must be effectively integrated and be easy to interrogate. Here, we discuss the development of the TDR Targets database, which encompasses extensive genetic, biochemical and pharmacological data related to tropical disease pathogens, as well as computationally predicted druggability for potential targets and compound desirability information. By allowing the integration and weighting of this information, this database aims to facilitate the identification and prioritisation of candidate drug targets for pathogens.
PMCID: PMC3184002  PMID: 18927591
4.  TSIDER1, a short and non-autonomous Salivarian trypanosome-specific retroposon related to the ingi6 subclade 
Graphical abstract
Position of African trypanosome-specific TSIDER1 retroposons in the ingi and trypanosomatid phylogenetic trees.
► A new retroposon family of the ingi clade, named TSIDER1, was identified. ► TSIDER1 are short degenerate retroposons. ► TSIDER1 is only present in the nuclear genome of African trypanosomes. ► In contrast to Leishmania, African trypanosomes have not expanded and domesticated SIDER.
Retroposons of the ingi clade are the most abundant transposable elements identified in the trypanosomatid genomes. Some are long autonomous elements (ingi, L1Tc) while others, such as RIME and NARTc, are short non-coding elements that parasitize the retrotransposition machinery of the active autonomous ones for their own mobilization. Here, we identified a new family of short non-autonomous retroposons of the ingi clade, called TSIDER1, which are present in the genome of Salivarian (African) trypanosomes, Trypanosoma brucei, T. congolense and T. vivax, but absent in the T. cruzi and Leishmania spp. genomes and, as such, TSIDER1 is the only retroposon subfamily conserved at the nucleotide level between African trypanosome species. We identified three TvSIDER1 families within the genome of T. vivax and the high level of sequence conservation within the TvSIDER1a and TvSIDER1b groups suggests that they are still active. We propose that TvSIDER1a/b elements are using the Tvingi retrotransposition machinery, as they are preceded by the same conserved pattern characteristic of the ingi6 subclade, which corresponds to the retroposon-encoded endonuclease binding site. In contrast, TcoSIDER1, TbSIDER1 and TvSIDER1c are too divergent to be considered as active retroposons. The relatively low number of SIDER elements identified in the T. congolense (70 copies), T. vivax (32 copies) and T. brucei (22 copies) genomes confirms that trypanosomes have not expanded short transposable elements, which is in contrast to Leishmania spp. (∼2000 copies), where SIDER play a role in the regulation of gene expression.
PMCID: PMC3820030  PMID: 21664383
SIDER, Short Interspersed DEgenerate Retroposons; DIRE, Degenerate Ingi/L1Tc-Related Element; African trypanosomes; Ingi; Retroposon; Non-LTR retrotransposon; Non-autonomous; SIDER
5.  Differential protein expression throughout the life cycle of Trypanosoma congolense, a major parasite of cattle in Africa 
Graphical abstract
► All 4 major life cycle stages of Trypanosoma congolense were grown in the lab. ► Relative protein expression among the life cycle stages was studied by iTRAQ MS. ► Several known expression trends were observed and new patterns were identified. ► Special focus is given to lifecycle stage specific surface molecules. ► Six new proteins unique to T. congolense were discovered.
Trypanosoma congolense is an important pathogen of livestock in Africa. To study protein expression throughout the T. congolense life cycle, we used culture-derived parasites of each of the three main insect stages and bloodstream stage parasites isolated from infected mice, to perform differential protein expression analysis. Three complete biological replicates of all four life cycle stages were produced from T. congolense IL3000, a cloned parasite that is amenable to culture of major life cycle stages in vitro. Cellular proteins from each life cycle stage were trypsin digested and the resulting peptides were labeled with isobaric tags for relative and absolute quantification (iTRAQ). The peptides were then analyzed by tandem mass spectrometry (MS/MS). This method was used to identify and relatively quantify proteins from the different life cycle stages in the same experiment. A search of the Wellcome Trust's Sanger Institute's semi-annotated T. congolense database was performed using the MS/MS fragmentation data to identify the corresponding source proteins. A total of 2088 unique protein sequences were identified, representing 23% of the ∼9000 proteins predicted for the T. congolense proteome. The 1291 most confidently identified proteins were prioritized for further study. Of these, 784 yielded annotated hits while 501 were described as “hypothetical proteins”. Six proteins showed no significant sequence similarity to any known proteins (from any species) and thus represent new, previously uncharacterized T. congolense proteins. Of particular interest among the remainder are several membrane molecules that showed drastic differential expression, including, not surprisingly, the well-studied variant surface glycoproteins (VSGs), invariant surface glycoproteins (ISGs) 65 and 75, congolense epimastigote specific protein (CESP), the surface protease GP63, an amino acid transporter, a pteridine transporter and a haptoglobin–hemoglobin receptor. 
Several of these surface disposed proteins are of functional interest as they are necessary for survival of the parasites.
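The counts quoted in this abstract can be checked against one another with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative, and the proteome size is the approximate value (∼9000) given in the text, so the coverage is itself approximate.

```python
# Figures taken from the abstract; variable names are illustrative only.
identified = 2088          # unique protein sequences identified
predicted = 9000           # approximate size of the predicted proteome

# Share of the predicted proteome that was identified, in percent.
coverage_percent = 100 * identified / predicted

prioritized = 1291         # most confidently identified proteins
annotated = 784            # proteins with annotated hits
hypothetical = 501         # proteins described as "hypothetical proteins"
novel = 6                  # proteins with no significant similarity

# The three categories should account for every prioritized protein.
accounted = annotated + hypothetical + novel

print(f"coverage: {coverage_percent:.1f}%")
print(f"accounted for: {accounted} of {prioritized}")
```

The coverage rounds to the 23% reported, and the three categories sum to the 1291 prioritized proteins.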
PMCID: PMC3820035  PMID: 21354217
iTRAQ, isobaric tags for relative and absolute quantitation; PF, procyclic form; PCF, procyclic culture form; EMF, epimastigote form; MCF, metacyclic form; BSF, bloodstream form; VSG, variant surface glycoprotein; CESP, congolense epimastigote-specific protein; ORF, open reading frame; Trypanosoma congolense; Life cycle; Proteomics; Protein expression
6.  Analysis of expressed sequence tags from the four main developmental stages of Trypanosoma congolense 
Trypanosoma congolense is one of the most economically important pathogens of livestock in Africa. Culture-derived parasites of each of the three main insect stages of the T. congolense life cycle, i.e., the procyclic, epimastigote and metacyclic stages, and bloodstream stage parasites isolated from infected mice, were used to construct stage-specific cDNA libraries and expressed sequence tags (ESTs or cDNA clones) in each library were sequenced. Thirteen EST clusters encoding different variant surface glycoproteins (VSGs) were detected in the metacyclic library and twenty-six VSG EST clusters were found in the bloodstream library, six of which are shared by the metacyclic library. Rare VSG ESTs are present in the epimastigote library, and none were detected in the procyclic library. ESTs encoding enzymes that catalyze oxidative phosphorylation and amino acid metabolism are about twice as abundant in the procyclic and epimastigote stages as in the metacyclic and bloodstream stages. In contrast, ESTs encoding enzymes involved in glycolysis, the citric acid cycle and nucleotide metabolism are about the same in all four developmental stages. Cysteine proteases, kinases and phosphatases are the most abundant enzyme groups represented by the ESTs. All four libraries contain T. congolense-specific expressed sequences not present in the T. brucei and T. cruzi genomes. Normalized cDNA libraries were constructed from the metacyclic and bloodstream stages, and found to be further enriched for T. congolense-specific ESTs. Given that cultured T. congolense offers an experimental advantage over other African trypanosome species, these ESTs provide a basis for further investigation of the molecular properties of these four developmental stages, especially the epimastigote and metacyclic stages for which it is difficult to obtain large quantities of organisms. The T. congolense EST databases are available at:
PMCID: PMC2741298  PMID: 19559733
metacyclic; epimastigote; procyclic; bloodstream; variant surface glycoprotein; cysteine protease; hypothetical proteins
7.  Identification of Attractive Drug Targets in Neglected-Disease Pathogens Using an In Silico Approach 
The increased sequencing of pathogen genomes and the subsequent availability of genome-scale functional datasets are expected to guide the experimental work necessary for target-based drug discovery. However, a major bottleneck in this has been the difficulty of capturing and integrating relevant information in an easily accessible format for identifying and prioritizing potential targets. The open-access resource facilitates drug target prioritization for major tropical disease pathogens such as the mycobacteria Mycobacterium leprae and Mycobacterium tuberculosis; the kinetoplastid protozoans Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi; the apicomplexan protozoans Plasmodium falciparum, Plasmodium vivax, and Toxoplasma gondii; and the helminths Brugia malayi and Schistosoma mansoni.
Methodology/Principal Findings
Here we present strategies to prioritize pathogen proteins based on whether their properties meet criteria considered desirable in a drug target. These criteria are based upon both sequence-derived information (e.g., molecular mass) and functional data on expression, essentiality, phenotypes, metabolic pathways, assayability, and druggability. This approach also highlights the fact that data for many relevant criteria are lacking in less-studied pathogens (e.g., helminths), and we demonstrate how this can be partially overcome by mapping data from homologous genes in well-studied organisms. We also show how individual users can easily upload external datasets and integrate them with existing data to generate highly customized ranked lists of potential targets.
Using the datasets and the tools available in the resource, we have generated illustrative lists of potential drug targets in seven tropical disease pathogens. While these lists are broadly consistent with the research community's current interest in certain specific proteins, and suggest novel target candidates that may merit further study, the lists can easily be modified in a user-specific manner, either by adjusting the weights for chosen criteria or by changing the criteria that are included.
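The idea of ranking targets by weighted criteria, and of re-ranking by adjusting the weights, can be illustrated with a minimal sketch. This is not the resource's actual scoring scheme or interface: the function `rank_targets`, the criterion names, the protein names and the weights are all hypothetical, and the score is assumed to be a simple sum of the weights of the criteria a protein meets.

```python
from typing import Dict, List, Tuple


def rank_targets(
    targets: Dict[str, Dict[str, bool]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank proteins by the summed weights of the criteria they meet.

    Hypothetical scoring rule: each criterion that a protein satisfies
    contributes its weight; missing criteria count as not satisfied.
    """
    scored = []
    for name, criteria in targets.items():
        score = sum(w for c, w in weights.items() if criteria.get(c, False))
        scored.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical proteins and criteria, for illustration only.
targets = {
    "protein_A": {"essential": True, "druggable": True, "assayable": False},
    "protein_B": {"essential": True, "druggable": False, "assayable": True},
    "protein_C": {"essential": False, "druggable": True, "assayable": True},
}

# One user's weighting emphasizes essentiality ...
ranking = rank_targets(
    targets, {"essential": 3.0, "druggable": 2.0, "assayable": 1.0}
)

# ... while another emphasizes assayability, which reorders the list.
reranked = rank_targets(
    targets, {"essential": 1.0, "druggable": 1.0, "assayable": 3.0}
)

print(ranking)
print(reranked)
```

Changing only the weights moves `protein_A` from first to last place, which is the kind of user-specific adjustment the abstract describes.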
Author Summary
In cell-based drug development, researchers attempt to create drugs that kill a pathogen without necessarily understanding the details of how the drugs work. In contrast, target-based drug development entails the search for compounds that act on a specific intracellular target—often a protein known or suspected to be required for survival of the pathogen. The latter approach to drug development has been facilitated greatly by the sequencing of many pathogen genomes and the incorporation of genome data into user-friendly databases. The present paper shows how the database can identify proteins that might be considered good drug targets for diseases such as African sleeping sickness, Chagas disease, parasitic worm infections, tuberculosis, and malaria. These proteins may score highly in searches of the database because they are dissimilar to human proteins, are structurally similar to other “druggable” proteins, have functions that are easy to measure, and/or fulfill other criteria. Researchers can use the lists of high-scoring proteins as a basis for deciding which potential drug targets to pursue experimentally.
PMCID: PMC2927427  PMID: 20808766
8.  Multiple Genetic Mechanisms Lead to Loss of Functional TbAT1 Expression in Drug-Resistant Trypanosomes
Eukaryotic Cell  2010;9(2):336-343.
The P2 aminopurine transporter, encoded by TbAT1 in African trypanosomes in the Trypanosoma brucei group, carries melaminophenyl arsenical and diamidine drugs into these parasites. Loss of this transporter contributes to drug resistance. We identified the genomic location of TbAT1 to be in the subtelomeric region of chromosome 5 and determined the status of the TbAT1 gene in two trypanosome lines selected for resistance to the melaminophenyl arsenical, melarsamine hydrochloride (Cymelarsan), and in a Trypanosoma equiperdum clone selected for resistance to the diamidine, diminazene aceturate. In the Trypanosoma brucei gambiense STIB 386 melarsamine hydrochloride-resistant line, TbAT1 is deleted, while in the Trypanosoma brucei brucei STIB 247 melarsamine hydrochloride-resistant and T. equiperdum diminazene-resistant lines, TbAT1 is present, but expression at the RNA level is no longer detectable. Further characterization of TbAT1 in T. equiperdum revealed that a loss of heterozygosity at the TbAT1 locus accompanied loss of expression and that P2-mediated uptake of [3H]diminazene is lost in drug-resistant T. equiperdum. Adenine-inhibitable adenosine uptake is still detectable in a ΔTbat1 T. b. brucei mutant, although at a greatly reduced capacity compared to that of the wild type, indicating that an additional adenine-inhibitable adenosine permease, distinct from P2, is present in these cells.
PMCID: PMC2823006  PMID: 19966032
9.  The Genome Sequence of Trypanosoma brucei gambiense, Causative Agent of Chronic Human African Trypanosomiasis 
Trypanosoma brucei gambiense is the causative agent of chronic Human African Trypanosomiasis or sleeping sickness, a disease endemic across often poor and rural areas of Western and Central Africa. We have previously published the genome sequence of a T. b. brucei isolate, and have now employed a comparative genomics approach to understand the scale of genomic variation between T. b. gambiense and the reference genome. We sought to identify features that were uniquely associated with T. b. gambiense and its ability to infect humans.
Methods and Findings
An improved high-quality draft genome sequence for the group 1 T. b. gambiense DAL 972 isolate was produced using a whole-genome shotgun strategy. Comparison with T. b. brucei showed that sequence identity averages 99.2% in coding regions, and gene order is largely collinear. However, variation associated with segmental duplications and tandem gene arrays suggests some reduction of functional repertoire in T. b. gambiense DAL 972. A comparison of the variant surface glycoproteins (VSG) in T. b. brucei with all T. b. gambiense sequence reads showed that the essential structural repertoire of VSG domains is conserved across T. brucei.
This study provides the first estimate of intraspecific genomic variation within T. brucei, and so has important consequences for future population genomics studies. We have shown that the T. b. gambiense genome corresponds closely with the reference, which should therefore be an effective scaffold for any T. brucei genome sequence data. As VSG repertoire is also well conserved, it may be feasible to describe the total diversity of variant antigens. While we describe several as yet uncharacterized gene families with predicted cell surface roles that were expanded in number in T. b. brucei, no T. b. gambiense-specific gene was identified outside of the subtelomeres that could explain the ability to infect humans.
Author Summary
Sleeping sickness, or Human African Trypanosomiasis, is a disease affecting the health and productivity of poor people in many rural areas of sub-Saharan Africa. The disease is caused by a single-celled flagellate, Trypanosoma brucei, which evades the immune system by periodically switching the proteins on its surface. We have produced a genome sequence for T. brucei gambiense, which is the particular subspecies causing most disease in humans. We compared this with an existing reference genome for a non-human infecting strain (T. b. brucei 927) to identify genes in T. b. gambiense that might explain its ability to infect humans and to assess how well the reference performs as a universal plan for all T. brucei. The genome sequences differ only due to rare insertions and duplications and homologous genes are over 95% identical on average. The archive of surface antigens that enable the parasite to switch its protein coat is remarkably consistent, even though it evolves very quickly. We identified genes with predicted cell surface functions that are only present in T. b. brucei and have evolved rapidly in recent time. These genes might help to explain variation in disease pathology between different T. brucei strains in different hosts.
PMCID: PMC2854126  PMID: 20404998
10.  Trypanosomatid Genomes Contain Several Subfamilies of ingi-Related Retroposons
Eukaryotic Cell  2009;8(10):1532-1542.
Retroposons are ubiquitous transposable elements found in the genomes of most eukaryotes, including trypanosomatids. The African and American trypanosomes (Trypanosoma brucei and Trypanosoma cruzi) contain long autonomous retroposons of the ingi clade (Tbingi and L1Tc, respectively) and short nonautonomous truncated versions (TbRIME and NARTc, respectively), as well as degenerate ingi-related retroposons devoid of coding capacity (DIREs). In contrast, Leishmania major contains only remnants of extinct retroposons (LmDIREs) and of short nonautonomous heterogeneous elements (LmSIDERs). We extend this comparative and evolutionary analysis of retroposons to the genomes of two other African trypanosomes (Trypanosoma congolense and Trypanosoma vivax) and another Leishmania sp. (Leishmania braziliensis). Three new potentially functional retroposons of the ingi clade have been identified: Tvingi in T. vivax and Tcoingi and L1Tco in T. congolense. T. congolense is the first trypanosomatid containing two classes of potentially active retroposons of the ingi clade. We analyzed sequences located upstream of these new long autonomous ingi-related elements, which code for the recognition site of the retroposon-encoded endonuclease. The closely related Tcoingi and Tvingi elements show the same conserved pattern, indicating that the Tcoingi- and Tvingi-encoded endonucleases share site specificity. Similarly, the conserved pattern previously identified upstream of L1Tc has also been detected at the same relative position upstream of L1Tco elements. A phylogenetic analysis of all ingi-related retroposons identified so far, including DIREs, clearly shows that several distinct subfamilies have emerged and coexisted, though in the course of trypanosomatid evolution, only a few have been maintained as active elements in modern trypanosomatid (sub)species.
PMCID: PMC2756862  PMID: 19666780
11.  The genome of the blood fluke Schistosoma mansoni 
Nature  2009;460(7253):352-358.
Schistosoma mansoni is responsible for the neglected tropical disease schistosomiasis that affects 210 million people in 76 countries. We report here analysis of the 363 megabase nuclear genome of the blood fluke. It encodes at least 11,809 genes, with an unusual intron size distribution, and novel families of micro-exon genes that undergo frequent alternate splicing. As the first sequenced flatworm, and a representative of the lophotrochozoa, it offers insights into early events in the evolution of the animals, including the development of a body pattern with bilateral symmetry, and the development of tissues into organs. Our analysis has been informed by the need to find new drug targets. The deficits in lipid metabolism that make schistosomes dependent on the host are revealed, while the identification of membrane receptors, ion channels and more than 300 proteases, provide new insights into the biology of the life cycle and novel targets. Bioinformatics approaches have identified metabolic chokepoints while a chemogenomic screen has pinpointed schistosome proteins for which existing drugs may be active. The information generated provides an invaluable resource for the research community to develop much needed new control tools for the treatment and eradication of this important and neglected disease.
PMCID: PMC2756445  PMID: 19606141
12.  TriTrypDB: a functional genomic resource for the Trypanosomatidae 
Nucleic Acids Research  2009;38(Database issue):D457-D462.
TriTrypDB is an integrated database providing access to genome-scale datasets for kinetoplastid parasites, and supporting a variety of complex queries driven by research and development needs. TriTrypDB is a collaborative project, utilizing the GUS/WDK computational infrastructure developed by the Eukaryotic Pathogen Bioinformatics Resource Center to integrate genome annotation and analyses from GeneDB and elsewhere with a wide variety of functional genomics datasets made available by members of the global research community, often pre-publication. Currently, TriTrypDB integrates datasets from Leishmania braziliensis, L. infantum, L. major, L. tarentolae, Trypanosoma brucei and T. cruzi. Users may examine individual genes or chromosomal spans in their genomic context, including syntenic alignments with other kinetoplastid organisms. Data within TriTrypDB can be interrogated utilizing a sophisticated search strategy system that enables a user to construct complex queries combining multiple data types. All search strategies are stored, allowing future access and integrated searches. ‘User Comments’ may be added to any gene page, enhancing available annotation; such comments become immediately searchable via the text search, and are forwarded to curators for incorporation into the reference annotation when appropriate.
PMCID: PMC2808979  PMID: 19843604
13.  Discovery of Mating in the Major African Livestock Pathogen Trypanosoma congolense 
PLoS ONE  2009;4(5):e5564.
The protozoan parasite, Trypanosoma congolense, is one of the most economically important pathogens of livestock in Africa and, through its impact on cattle health and productivity, has a significant effect on human health and well being. Despite the importance of this parasite our knowledge of some of the fundamental biological processes is limited. For example, it is unknown whether mating takes place. In this paper we have taken a population genetics based approach to address this question. The availability of genome sequence of the parasite allowed us to identify polymorphic microsatellite markers, which were used to genotype T. congolense isolates from livestock in a discrete geographical area of The Gambia. The data showed a high level of diversity with a large number of distinct genotypes, but a deficit in heterozygotes. Further analysis identified cryptic genetic subdivision into four sub-populations. In one of these, parasite genotypic diversity could only be explained by the occurrence of frequent mating in T. congolense. These data are completely inconsistent with previous suggestions that the parasite expands asexually in the absence of mating. The discovery of mating in this species of trypanosome has significant consequences for the spread of critical traits, such as drug resistance, as well as for fundamental aspects of the biology and epidemiology of this neglected but economically important pathogen.
PMCID: PMC2679202  PMID: 19440370
14.  Telomeric Expression Sites Are Highly Conserved in Trypanosoma brucei 
PLoS ONE  2008;3(10):e3527.
Subtelomeric regions are often under-represented in genome sequences of eukaryotes. Among the best-known examples of the use of telomere proximity for adaptive purposes are the bloodstream expression sites (BESs) of the African trypanosome Trypanosoma brucei. To enhance our understanding of BES structure and function in host adaptation and immune evasion, the BESs of the Lister 427 strain of T. brucei were independently tagged and sequenced. BESs are polymorphic in size and structure but reveal a surprisingly conserved architecture in the context of extensive recombination. Very small BESs do exist and many functioning BESs do not contain the full complement of expression site associated genes (ESAGs). The consequences of duplicated or missing ESAGs, including ESAG9, a newly named ESAG12, and additional variant surface glycoprotein genes (VSGs), were evaluated by functional assays after BESs were tagged with a drug-resistance gene. Phylogenetic analysis of constituent ESAG families suggests that BESs are sequence mosaics and that extensive recombination has shaped the evolution of the BES repertoire. This work opens important perspectives in understanding the molecular mechanisms of antigenic variation, a widely used strategy for immune evasion in pathogens, and telomere biology.
PMCID: PMC2567434  PMID: 18953401
15.  The minimum information about a genome sequence (MIGS) specification 
Nature biotechnology  2008;26(5):541-547.
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases.
PMCID: PMC2409278  PMID: 18464787
16.  The Genome of the Kinetoplastid Parasite, Leishmania major 
Ivens, Alasdair C. | Peacock, Christopher S. | Worthey, Elizabeth A. | Murphy, Lee | Aggarwal, Gautam | Berriman, Matthew | Sisk, Ellen | Rajandream, Marie-Adele | Adlem, Ellen | Aert, Rita | Anupama, Atashi | Apostolou, Zina | Attipoe, Philip | Bason, Nathalie | Bauser, Christopher | Beck, Alfred | Beverley, Stephen M. | Bianchettin, Gabriella | Borzym, Katja | Bothe, Gordana | Bruschi, Carlo V. | Collins, Matt | Cadag, Eithon | Ciarloni, Laura | Clayton, Christine | Coulson, Richard M. R. | Cronin, Ann | Cruz, Angela K. | Davies, Robert M. | Gaudenzi, Javier De | Dobson, Deborah E. | Duesterhoeft, Andreas | Fazelina, Gholam | Fosker, Nigel | Frasch, Alberto Carlos | Fraser, Audrey | Fuchs, Monika | Gabel, Claudia | Goble, Arlette | Goffeau, André | Harris, David | Hertz-Fowler, Christiane | Hilbert, Helmut | Horn, David | Huang, Yiting | Klages, Sven | Knights, Andrew | Kube, Michael | Larke, Natasha | Litvin, Lyudmila | Lord, Angela | Louie, Tin | Marra, Marco | Masuy, David | Matthews, Keith | Michaeli, Shulamit | Mottram, Jeremy C. | Müller-Auer, Silke | Munden, Heather | Nelson, Siri | Norbertczak, Halina | Oliver, Karen | O'Neil, Susan | Pentony, Martin | Pohl, Thomas M. | Price, Claire | Purnelle, Bénédicte | Quail, Michael A. | Rabbinowitsch, Ester | Reinhardt, Richard | Rieger, Michael | Rinta, Joel | Robben, Johan | Robertson, Laura | Ruiz, Jeronimo C. | Rutter, Simon | Saunders, David | Schäfer, Melanie | Schein, Jacquie | Schwartz, David C. | Seeger, Kathy | Seyler, Amber | Sharp, Sarah | Shin, Heesun | Sivam, Dhileep | Squares, Rob | Squares, Steve | Tosato, Valentina | Vogt, Christy | Volckaert, Guido | Wambutt, Rolf | Warren, Tim | Wedler, Holger | Woodward, John | Zhou, Shiguo | Zimmermann, Wolfgang | Smith, Deborah F. | Blackwell, Jenefer M. | Stuart, Kenneth D. | Barrell, Bart | Myler, Peter J.
Science (New York, N.Y.)  2005;309(5733):436-442.
PMCID: PMC1470643  PMID: 16020728
17.  GeneDB: a resource for prokaryotic and eukaryotic organisms 
Nucleic Acids Research  2004;32(Database issue):D339-D343.
GeneDB is a genome database for prokaryotic and eukaryotic organisms. The resource provides a portal through which data generated by the Pathogen Sequencing Unit at the Wellcome Trust Sanger Institute and other collaborating sequencing centres can be made publicly available. It combines data from finished and ongoing genome and expressed sequence tag (EST) projects with curated annotation that can be searched, sorted and downloaded using a single web-based resource. The current release stores 11 datasets, of which six are curated and maintained by biologists, who review and incorporate information from the scientific literature, public databases and the respective research communities.
PMCID: PMC308742  PMID: 14681429
18.  The DNA sequence of chromosome I of an African trypanosome: gene content, chromosome organisation, recombination and polymorphism 
Nucleic Acids Research  2003;31(16):4864-4873.
The African trypanosome, Trypanosoma brucei, causes sleeping sickness in humans in sub-Saharan Africa. Here we report the sequence and analysis of the 1.1 Mb chromosome I, which encodes approximately 400 predicted genes organised into directional clusters, of which more than 100 are located in the largest cluster of 250 kb. A 160-kb region consists primarily of three gene families of unknown function, one of which contains a hotspot for retroelement insertion. We also identify five novel gene families. Indeed, almost 20% of predicted genes are members of families. In some cases, tandemly arrayed genes are 99–100% identical, suggesting an active process of amplification and gene conversion. One end of the chromosome consists of a putative bloodstream-form variant surface glycoprotein (VSG) gene expression site that appears truncated and degenerate. The other chromosome end carries VSG and expression site-associated genes and pseudogenes over 50 kb of subtelomeric sequence where, unusually, the telomere-proximal VSG gene is oriented away from the telomere. Our analysis includes the cataloguing of minor genetic variations between the chromosome I homologues and an estimate of crossing-over frequency during genetic exchange. Genetic polymorphisms are exceptionally rare in sequences located within and around the strand-switches between several gene clusters.
PMCID: PMC169939  PMID: 12907729
