Results 1-10 (10)
 

1.  Evolutionary Dynamics of Immune-Related Genes and Pathways in Disease-Vector Mosquitoes 
Science (New York, N.Y.)  2007;316(5832):1738-1743.
Mosquitoes are vectors of parasitic and viral diseases of immense importance for public health. The acquisition of the genome sequence of the yellow fever and Dengue vector, Aedes aegypti (Aa), has enabled a comparative phylogenomic analysis of the insect immune repertoire: in Aa, the malaria vector Anopheles gambiae (Ag), and the fruit fly Drosophila melanogaster (Dm). Analysis of immune signaling pathways and response modules reveals both conservative and rapidly evolving features associated with different functional gene categories and particular aspects of immune reactions. These dynamics reflect in part continuous readjustment between accommodation and rejection of pathogens and suggest how innate immunity may have evolved.
doi:10.1126/science.1139862
PMCID: PMC2042107  PMID: 17588928
2.  Transcriptome of the adult female malaria mosquito vector Anopheles albimanus 
BMC Genomics  2012;13:207.
Background
Human malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree transmit malaria, suggesting that their vectorial capacity has evolved independently. Anopheles albimanus (subgenus Nyssorhynchus) is an important malaria vector in the Americas. The divergence time between Anopheles gambiae, the main malaria vector in Africa, and the Neotropical vectors has been estimated to be 100 My. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to explore the mosquito biology beyond the An. gambiae complex.
Results
We sequenced the transcriptome of the An. albimanus adult female. By combining Sanger, 454 and Illumina sequences from cDNA libraries derived from the midgut, cuticular fat body, dorsal vessel, salivary gland and whole body, we generated a single, high-quality assembly containing 16,669 transcripts, 92% of which mapped to the An. darlingi genome and covered 90% of the core eukaryotic genome. Bidirectional comparisons between the An. gambiae, An. darlingi and An. albimanus predicted proteomes allowed the identification of 3,772 putative orthologs. More than half of the transcripts had a match to proteins in other insect vectors and had an InterPro annotation. We identified several protein families that may be relevant to the study of Plasmodium-mosquito interaction. An open source transcript annotation browser called GDAV (Genome-Delinked Annotation Viewer) was developed to facilitate public access to the data generated by this and future transcriptome projects.
Conclusions
We have explored the adult female transcriptome of one important New World malaria vector, An. albimanus. We identified protein-coding transcripts involved in biological processes that may be relevant to the Plasmodium lifecycle and can serve as the starting point for searching targets for novel control strategies. Our data increase the available genomic information regarding An. albimanus several hundred-fold, and will facilitate molecular research in medical entomology, evolutionary biology, genomics and proteomics of anopheline mosquito vectors. The data reported in this manuscript are accessible to the community via the VectorBase website (http://www.vectorbase.org/Other/AdditionalOrganisms/).
doi:10.1186/1471-2164-13-207
PMCID: PMC3442982  PMID: 22646700
Anopheles albimanus; Transcriptome; Malaria; RNA-Seq
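The "bidirectional comparisons" between predicted proteomes mentioned in entry 2 are often implemented as a reciprocal best hit search. The abstract does not state the exact procedure, so the following is only a minimal sketch under that assumption. It works on precomputed similarity scores for one pair of proteomes, and the function names and protein identifiers are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of {(query_id, subject_id): similarity_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores for two small proteomes (illustration only).
a_vs_b = {("alb1", "gam1"): 200.0, ("alb1", "gam2"): 50.0, ("alb2", "gam2"): 90.0}
b_vs_a = {("gam1", "alb1"): 190.0, ("gam2", "alb1"): 60.0, ("gam2", "alb2"): 40.0}
putative_orthologs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(putative_orthologs)
```

Extending this to three proteomes, as in the study, would require the pairwise result to hold for every pair of species; that step is not shown here.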
3.  An expression map for Anopheles gambiae 
BMC Genomics  2011;12:620.
Background
Quantitative transcriptome data for the malaria-transmitting mosquito Anopheles gambiae covers a broad range of biological and experimental conditions, including development, blood feeding and infection. Web-based summaries of differential expression for individual genes with respect to these conditions are a useful tool for the biologist, but they lack the context that a visualisation of all genes with respect to all conditions would give. For most organisms, including A. gambiae, such a systems-level view of gene expression is not yet available.
Results
We have clustered microarray-based gene-averaged expression values, available from VectorBase, for 10194 genes over 93 experimental conditions using a self-organizing map. Map regions corresponding to known biological events, such as egg production, are revealed. Many individual gene clusters (nodes) on the map are highly enriched in biological and molecular functions, such as protein synthesis, protein degradation and DNA replication. Gene families, such as odorant binding proteins, can be classified into distinct functional groups based on their expression and evolutionary history. Immunity-related genes are non-randomly distributed in several distinct regions on the map, and are generally distant from genes with house-keeping roles. Each immunity-rich region appears to represent a distinct biological context for pathogen recognition and clearance (e.g. the humoral and gut epithelial responses). Several immunity gene families, such as peptidoglycan recognition proteins (PGRPs) and defensins, appear to be specialised for these distinct roles, while three genes with physically interacting protein products (LRIM1/APL1C/TEP1) are found in close proximity.
Conclusions
The map provides the first genome-scale, multi-experiment overview of gene expression in A. gambiae and should also be useful at the gene-level for investigating potential interactions. A web interface is available through the VectorBase website http://www.vectorbase.org/. It is regularly updated as new experimental data becomes available.
doi:10.1186/1471-2164-12-620
PMCID: PMC3341590  PMID: 22185628
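Entry 3 clusters gene expression profiles with a self-organizing map. As a rough illustration of the technique only (not the authors' implementation), the sketch below trains a tiny one-dimensional map in pure Python. The map size, learning-rate schedule, initialisation from data rows and the toy expression profiles are all assumptions chosen for brevity; the published map is far larger.

```python
import math


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def train_som(data, n_nodes, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map and return its node weight vectors."""
    # Assumption: initialise nodes from evenly spaced data rows so the run is deterministic.
    step = max(1, (len(data) - 1) // max(1, n_nodes - 1))
    weights = [list(data[min(i * step, len(data) - 1)]) for i in range(n_nodes)]
    total = epochs * len(data)
    t = 0
    for _ in range(epochs):
        for x in data:
            frac = t / total
            lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood: nodes near the winner move more.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            t += 1
    return weights


# Toy "expression profiles": two genes low across conditions, two genes high.
profiles = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [1.0, 0.9, 1.0], [0.9, 1.0, 0.9]]
nodes = train_som(profiles, n_nodes=2)
assignments = [best_matching_unit(nodes, p) for p in profiles]
print(assignments)
```

Genes with similar profiles end up on the same or neighbouring nodes, which is what allows map regions to be interpreted as biological contexts.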
4.  VectorBase: improvements to a bioinformatics resource for invertebrate vector genomics 
Nucleic Acids Research  2011;40(D1):D729-D734.
VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community.
doi:10.1093/nar/gkr1089
PMCID: PMC3245112  PMID: 22135296
5.  Analyses of cerebral microdialysis in patients with traumatic brain injury: relations to intracranial pressure, cerebral perfusion pressure and catheter placement 
BMC Medicine  2011;9:21.
Background
Cerebral microdialysis (MD) is used to monitor local brain chemistry of patients with traumatic brain injury (TBI). Despite an extensive literature on cerebral MD in the clinical setting, it remains unclear how individual levels of real-time MD data are to be interpreted. Intracranial pressure (ICP) and cerebral perfusion pressure (CPP) are important continuous brain monitors in neurointensive care. They are used as surrogate monitors of cerebral blood flow and have an established relation to outcome. The purpose of this study was to investigate the relations between MD parameters and ICP and/or CPP in patients with TBI.
Methods
Cerebral MD, ICP and CPP were monitored in 90 patients with TBI. Data were extensively analyzed, using over 7,350 samples of complete (hourly) MD data sets (glucose, lactate, pyruvate and glycerol) to seek representations of ICP, CPP and MD that were best correlated. MD catheter positions were located on computed tomography scans as pericontusional or nonpericontusional. MD markers were analyzed for correlations to ICP and CPP using time series regression analysis, mixed effects models and nonlinear (artificial neural networks) computer-based pattern recognition methods.
Results
Despite much data indicating highly perturbed metabolism, MD shows weak correlations to ICP and CPP. In contrast, the autocorrelation of MD is high for all markers, even at up to 30 future hours. Consequently, subject identity alone explains 52% to 75% of MD marker variance. This indicates that the dominant metabolic processes monitored with MD are long-term, spanning days or longer. In comparison, short-term (differenced or Δ) changes of MD vs. CPP are significantly correlated in pericontusional locations, but with less than 1% explained variance. Moreover, CPP and ICP were significantly related to outcome based on Glasgow Outcome Scale scores, while no significant relations were found between outcome and MD.
Conclusions
The multitude of highly perturbed local chemistry seen with MD in patients with TBI predominately represents long-term metabolic patterns and is weakly correlated to ICP and CPP. This suggests that disturbances other than pressure and/or flow have a dominant influence on MD levels in patients with TBI.
doi:10.1186/1741-7015-9-21
PMCID: PMC3056807  PMID: 21366904
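Entry 5 contrasts the high autocorrelation of microdialysis markers with the weak correlation of their short-term (differenced) changes to CPP. The sketch below shows one simple way to compute both quantities for a single series; it is not the time series regression or mixed effects modelling used in the study, and the function names and example values are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lag_autocorrelation(series, lag):
    """Correlation of a series with itself shifted by `lag` samples (lag >= 1)."""
    return pearson(series[:-lag], series[lag:])


def differenced(series):
    """First differences: the change from each sample to the next."""
    return [b - a for a, b in zip(series, series[1:])]


# Hypothetical hourly values for one marker and for CPP (illustration only).
marker = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cpp = [70.0, 68.0, 71.0, 69.0, 72.0, 70.0]
print(lag_autocorrelation(marker, 1))
print(pearson(differenced(marker), differenced(cpp)) if len(set(differenced(marker))) > 1 else None)
```

A slowly drifting series has a lag autocorrelation close to 1 even when its hour-to-hour changes carry little information, which is the pattern the abstract reports.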
6.  VectorBase: a data resource for invertebrate vector genomics 
Nucleic Acids Research  2008;37(Database issue):D583-D587.
VectorBase (http://www.vectorbase.org) is an NIAID-funded Bioinformatic Resource Center focused on invertebrate vectors of human pathogens. VectorBase annotates and curates vector genomes providing a web accessible integrated resource for the research community. Currently, VectorBase contains genome information for three mosquito species: Aedes aegypti, Anopheles gambiae and Culex quinquefasciatus, a body louse Pediculus humanus and a tick species Ixodes scapularis. Since our last report VectorBase has initiated a community annotation system, a microarray and gene expression repository and controlled vocabularies for anatomy and insecticide resistance. We have continued to develop both the software infrastructure and tools for interrogating the stored data.
doi:10.1093/nar/gkn857
PMCID: PMC2686483  PMID: 19028744
7.  The proteome: structure, function and evolution 
This paper reports two studies to model the inter-relationships between protein sequence, structure and function. First, an automated pipeline to provide a structural annotation of proteomes in the major genomes is described. The results are stored in a database at Imperial College, London (3D-GENOMICS) that can be accessed at www.sbg.bio.ic.ac.uk. Analysis of the assignments to structural superfamilies provides evolutionary insights. 3D-GENOMICS is being integrated with related proteome annotation data at University College London and the European Bioinformatics Institute in a project known as e-protein (http://www.e-protein.org/). The second topic is motivated by the developments in structural genomics projects in which the structure of a protein is determined prior to knowledge of its function. We have developed a new approach PHUNCTIONER that uses the gene ontology (GO) classification to supervise the extraction of the sequence signal responsible for protein function from a structure-based sequence alignment. Using GO we can obtain profiles for a range of specificities described in the ontology. In the region of low sequence similarity (around 15%), our method is more accurate than assignment from the closest structural homologue. The method is also able to identify the specific residues associated with the function of the protein family.
doi:10.1098/rstb.2005.1802
PMCID: PMC1609342  PMID: 16524832
bioinformatics; proteome annotation; protein function
8.  Improved alignment quality by combining evolutionary information, predicted secondary structure and self-organizing maps 
BMC Bioinformatics  2006;7:357.
Background
Protein sequence alignment is one of the basic tools in bioinformatics. Correct alignments are required for a range of tasks including the derivation of phylogenetic trees and protein structure prediction. Numerous studies have shown that the incorporation of predicted secondary structure information into alignment algorithms improves their performance. Secondary structure predictors have to be trained on a set of somewhat arbitrarily defined states (e.g. helix, strand, coil), and it has been shown that the choice of these states has some effect on alignment quality. However, it is not unlikely that prediction of other structural features also could provide an improvement. In this study we use an unsupervised clustering method, the self-organizing map, to assign sequence profile windows to "structural states" and assess their use in sequence alignment.
Results
The addition of self-organizing map locations as inputs to a profile-profile scoring function improves the alignment quality of distantly related proteins slightly. The improvement is slightly smaller than that gained from the inclusion of predicted secondary structure. However, the information seems to be complementary as the two prediction schemes can be combined to improve the alignment quality by a further small but significant amount.
Conclusion
It has been observed in many studies that predicted secondary structure significantly improves the alignments. Here we have shown that the addition of self-organizing map locations can further improve the alignments as the self-organizing map locations seem to contain some information that is not captured by the predicted secondary structure.
doi:10.1186/1471-2105-7-357
PMCID: PMC1562450  PMID: 16869963
9.  Automatic discovery of cross-family sequence features associated with protein function 
BMC Bioinformatics  2006;7:16.
Background
Methods for predicting protein function directly from amino acid sequences are useful tools in the study of uncharacterised protein families and in comparative genomics. Until now, this problem has been approached using machine learning techniques that attempt to predict membership, or otherwise, to predefined functional categories or subcellular locations. A potential drawback of this approach is that the human-designated functional classes may not accurately reflect the underlying biology, and consequently important sequence-to-function relationships may be missed.
Results
We show that a self-supervised data mining approach is able to find relationships between sequence features and functional annotations. No preconceived ideas about functional categories are required, and the training data is simply a set of protein sequences and their UniProt/Swiss-Prot annotations. The main technical aspect of the approach is the co-evolution of amino acid-based regular expressions and keyword-based logical expressions with genetic programming. Our experiments on a strictly non-redundant set of eukaryotic proteins reveal that the strongest and most easily detected sequence-to-function relationships are concerned with targeting to various cellular compartments, which is an area already well studied both experimentally and computationally. Of more interest are a number of broad functional roles which can also be correlated with sequence features. These include inhibition, biosynthesis, transcription and defence against bacteria. Despite substantial overlaps between these functions and their corresponding cellular compartments, we find clear differences in the sequence motifs used to predict some of these functions. For example, the presence of polyglutamine repeats appears to be linked more strongly to the "transcription" function than to the general "nuclear" function/location.
Conclusion
We have developed a novel and useful approach for knowledge discovery in annotated sequence data. The technique is able to identify functionally important sequence features and does not require expert knowledge. By viewing protein function from a sequence perspective, the approach is also suitable for discovering unexpected links between biological processes, such as the recently discovered role of ubiquitination in transcription.
doi:10.1186/1471-2105-7-16
PMCID: PMC1395344  PMID: 16409628
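Entry 9 describes amino acid-based regular expressions as sequence features and gives polyglutamine repeats as an example. The sketch below only illustrates matching such a feature with a hand-written pattern; the study evolves its expressions with genetic programming, which is not reproduced here. The function name, the minimum run length and the example sequence are assumptions.

```python
import re


def polyq_runs(sequence, min_len=5):
    """Return (start, length) for each run of at least `min_len` glutamines (Q)."""
    pattern = re.compile(r"Q{%d,}" % min_len)
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(sequence.upper())]


# Hypothetical protein fragment with a six-residue glutamine run.
fragment = "MK" + "Q" * 6 + "AL"
print(polyq_runs(fragment))
```

Any such pattern can then be tested for association with an annotation keyword, for example by counting how often matching proteins carry the keyword compared with non-matching ones.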
10.  3D-GENOMICS: a database to compare structural and functional annotations of proteins between sequenced genomes 
Nucleic Acids Research  2004;32(Database issue):D245-D250.
The 3D-GENOMICS database (http://www.sbg.bio.ic.ac.uk/3dgenomics/) provides structural annotations for proteins from sequenced genomes. In August 2003 the database included data for 93 proteomes. The annotations stored in the database include homologous sequences from various sequence databases, domains from SCOP and Pfam, patterns from Prosite and other predicted sequence features such as transmembrane regions and coiled coils. In addition to annotations at the sequence level, several precomputed cross-proteome comparative analyses are available based on SCOP domain superfamily composition. Annotations are available to the user via a web interface to the database. Multiple points of entry are available so that a user is able to: (i) directly access annotations for a single protein sequence via keywords or accession codes, (ii) examine a sequence of interest chosen from a summary of annotations for a particular proteome, or (iii) access precomputed frequency-based cross-proteome comparative analyses.
doi:10.1093/nar/gkh064
PMCID: PMC308798  PMID: 14681404
