1.  Effects of Warm Ischemic Time on Gene Expression Profiling in Colorectal Cancer Tissues and Normal Mucosa 
PLoS ONE  2013;8(1):e53406.
Background
Genome-wide gene expression analyses of tumors are a powerful tool to identify gene signatures associated with biologically and clinically relevant characteristics and for several tumor types are under clinical validation by prospective trials. However, handling and processing of clinical specimens may significantly affect the molecular data obtained from their analysis. We studied the effects of tissue handling time on gene expression in human normal and tumor colon tissues undergoing routine surgical procedures.
Methods
RNA extracted from specimens of 15 patients at four time points (for a total of 180 samples) after surgery was analyzed for gene expression on high-density oligonucleotide microarrays. A mixed-effects model was used to identify probes with different expression means across the four different time points. The p-values of the model were adjusted with the Bonferroni method.
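The Bonferroni adjustment mentioned above can be illustrated with a minimal sketch: each raw p-value is multiplied by the number of tests and capped at 1. The function name, the example p-values and the 0.05 cutoff are assumptions for illustration; they are not taken from the study.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: p times the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for four probes (illustrative only, not study data).
raw = [0.00001, 0.004, 0.03, 0.6]
adjusted = bonferroni_adjust(raw)

# Flag probes that remain below an assumed 0.05 threshold after adjustment.
significant = [p_adj < 0.05 for p_adj in adjusted]
```

With genome-wide arrays the number of tests is in the tens of thousands, so only probes with very small raw p-values survive this correction.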
Results
Thirty-two probe sets associated with tissue handling time in the tumor specimens, and thirty-one in the normal tissues, were identified. Most genes exhibited moderate changes in expression over the time points analyzed; however, four of them were oncogenes, and for two the effect of tissue handling was confirmed by independent validation.
Conclusions
Our results suggest that a critical time point for tissue handling in colon is 60 minutes at room temperature. Although the number of time-dependent genes we identified was low, the three genes that already showed changes at this time point in tumor samples were all oncogenes. We therefore recommend standardizing tissue-handling protocols and reducing the time from specimen removal to snap freezing, to account for warm ischemia in this tumor type.
doi:10.1371/journal.pone.0053406
PMCID: PMC3538764  PMID: 23308215
2.  T-Cell Receptors Binding Orientation over Peptide/MHC Class I Is Driven by Long-Range Interactions 
PLoS ONE  2012;7(12):e51943.
Crystallographic data about T-Cell Receptor – peptide – major histocompatibility complex class I (TCRpMHC) interaction have revealed extremely diverse TCR binding modes triggering antigen recognition. Understanding the molecular basis that governs TCR orientation over pMHC is still a considerable challenge. We present a simplified rigid approach applied to all non-redundant TCRpMHC crystal structures available. The CHARMM force field in combination with the FACTS implicit solvation model is used to study the role of long-distance interactions between the TCR and pMHC. We demonstrate that the sum of the Coulomb interactions and the electrostatic solvation energies is sufficient to identify two orientations corresponding to energetic minima at 0° and 180° from the native orientation. Interestingly, these results are shown to be robust upon small structural variations of the TCR such as changes induced by Molecular Dynamics simulations, suggesting that shape complementarity is not required to obtain a reliable signal. Accurate energy minima are also identified by confronting unbound TCR crystal structures with pMHC. Furthermore, we decompose the electrostatic energy into residue contributions to estimate their role in the overall orientation. Results show that most of the driving force leading to the formation of the complex is defined by CDR1,2/MHC interactions. This long-distance contribution appears to be independent of the binding process itself, since it is reliably identified without considering either short-range energy terms or CDR induced fit upon binding. Ultimately, we present an attempt to predict the TCR/pMHC binding mode for a TCR structure obtained by homology modeling. The simplicity of the approach and the absence of any fitted parameters also make it easily applicable to other types of macromolecular protein complexes.
doi:10.1371/journal.pone.0051943
PMCID: PMC3522592  PMID: 23251658
3.  Helminth secretome database (HSD): a collection of helminth excretory/secretory proteins predicted from expressed sequence tags (ESTs) 
BMC Genomics  2012;13(Suppl 7):S8.
Background
Helminths are important socio-economic organisms, responsible for causing major parasitic infections in humans, other animals and plants. These infections impose a significant public health and economic burden globally. Exceptionally, some helminth organisms like Caenorhabditis elegans are free-living in nature and serve as model organisms for studying parasitic infections. Excretory/secretory proteins play an important role in parasitic helminth infections, which makes these proteins attractive targets for therapeutic use. In the case of helminths, a large volume of expressed sequence tags (ESTs) has been generated to understand parasitism at the molecular level and to predict excretory/secretory proteins for developing novel strategies to tackle parasitic infections. However, most predicted ES proteins are not available for further analysis and there is no repository available for such predicted ES proteins. Furthermore, predictions have, in the main, focussed on classical secretory pathways while it is well established that helminth parasites also utilise non-classical secretory pathways.
Results
We developed a free Helminth Secretome Database (HSD), which serves as a repository for ES proteins predicted using classical and non-classical secretory pathways, from EST data for 78 helminth species (64 nematodes, 7 trematodes and 7 cestodes) ranging from parasitic to free-living organisms. Approximately 0.9 million ESTs compiled from the largest EST database, dbEST, were cleaned, assembled and analysed by different computational tools in our bioinformatics pipeline, and the predicted ES proteins were submitted to HSD.
Conclusion
We report the large-scale prediction and analysis of classically and non-classically secreted ES proteins from diverse helminth organisms. All the Unigenes (contigs and singletons) and excretory/secretory protein datasets generated from this analysis are freely available. A BLAST server is available at http://estexplorer.biolinfo.org/hsd, for checking the sequence similarity of new protein sequences against predicted helminth ES proteins.
doi:10.1186/1471-2164-13-S7-S8
PMCID: PMC3546426  PMID: 23281827
4.  TranSeqAnnotator: large-scale analysis of transcriptomic data 
BMC Bioinformatics  2012;13(Suppl 17):S24.
Background
The transcriptome of an organism can be studied through the analysis of expressed sequence tag (EST) data sets, which offers a rapid and cost-effective approach supported by several new and updated bioinformatics tools for assembly and annotation. Such analyses complement genome and proteome analysis in building a comprehensive understanding of an organism. With the advent of large-scale sequencing projects and the generation of sequence data at protein and cDNA levels, an automated analysis pipeline is necessary to store, organize and annotate ESTs.
Results
TranSeqAnnotator is a workflow for large-scale analysis of transcriptomic data with the most appropriate bioinformatics tools for data management and analysis. The pipeline automatically cleans, clusters, assembles and generates consensus sequences, conceptually translates these into possible protein products and assigns putative function based on various DNA and protein similarity searches. Excretory/secretory (ES) proteins inferred from ESTs/short reads are also identified. TranSeqAnnotator accepts raw and quality ESTs in FASTA format, along with protein and short read sequences, which are analysed with user-selected programs. After pre-processing and assembly, the dataset is annotated at the nucleotide, protein and ES protein levels.
Conclusion
TranSeqAnnotator has been developed on a Linux cluster to perform an exhaustive and reliable analysis and provide detailed annotation. TranSeqAnnotator outputs gene ontologies and protein functional identifications in terms of mapping to protein domains and metabolic pathways. The pipeline has been applied to annotate large EST datasets, identifying several novel and known genes, some with experimental therapeutic validation, that could serve as potential targets for parasite intervention. TranSeqAnnotator is freely available for the scientific community at http://estexplorer.biolinfo.org/TranSeqAnnotator/.
doi:10.1186/1471-2105-13-S17-S24
PMCID: PMC3521237  PMID: 23282024
5.  An analysis of the transcriptome of Teladorsagia circumcincta: its biological and biotechnological implications 
BMC Genomics  2012;13(Suppl 7):S10.
Background
Teladorsagia circumcincta (order Strongylida) is an economically important parasitic nematode of small ruminants (including sheep and goats) in temperate climatic regions of the world. Improved insights into the molecular biology of this parasite could underpin alternative methods required to control this and related parasites, in order to circumvent major problems associated with anthelmintic resistance. The aims of the present study were to define the transcriptome of the adult stage of T. circumcincta and to infer the main pathways linked to molecules known to be expressed in this nematode. Since sheep develop acquired immunity against T. circumcincta, there is some potential for the development of a vaccine against this parasite. Hence, we infer excretory/secretory molecules for T. circumcincta as possible immunogens and vaccine candidates.
Results
A total of 407,357 ESTs were assembled, yielding 39,852 putative gene sequences. Conceptual translation predicted 24,013 proteins, which were then subjected to detailed annotation, including pathway mapping of predicted proteins (including 112 excreted/secreted [ES] and 226 transmembrane peptides). Domain analysis and GO annotation were carried out using InterProScan along with BLAST2GO. Further analysis was carried out for secretory signal peptides using SignalP and for the non-classical secretory pathway using SecretomeP.
For ES proteins, key pathways, including Fc epsilon RI, T cell receptor, and chemokine signalling as well as leukocyte transendothelial migration, were inferred to be linked to immune responses, along with other pathways related to neurodegenerative diseases and infectious diseases, which warrant detailed future studies. KAAS could identify new and updated pathways like phagosome and protein processing in endoplasmic reticulum. Domain analysis for the assembled dataset revealed families of serine, cysteine and proteinase inhibitors which might represent targets for parasite intervention. InterProScan could identify GO terms pertaining to the extracellular region. Some of the important domain families identified included the SCP-like extracellular proteins, which belong to the pathogenesis-related proteins (PRPs) superfamily, along with C-type lectin and saposin-like proteins. The 'extracellular region' that corresponds to allergen V5/Tpx-1 related, considered important in parasite-host interactions, was also identified.
Other important findings included six cysteine motif (SXC1) proteins, transthyretin proteins, C-type lectins and activation-associated secreted proteins (ASPs), which could represent potential candidates for developing novel anthelmintics or vaccines. Of the proteins identified, SXC1, protein kinase domain-containing protein, trypsin family protein, trypsin-like protease family member (TRY-1), putative major allergen and putative lipid binding protein have not been reported in the published T. circumcincta proteomics analysis.
Detailed analysis of 6,058 raw EST sequences from dbEST revealed 315 putatively secreted proteins. Amongst them, C-type single domain activation associated secreted protein ASP3 precursor, activation-associated secreted proteins (ASP-like protein), cathepsin B-like cysteine protease, cathepsin L cysteine protease, cysteine protease, TransThyretin-Related and Venom-Allergen-like proteins were the key findings.
Conclusions
We have annotated a large dataset of ESTs of T. circumcincta and undertaken detailed comparative bioinformatics analyses. The results provide a comprehensive insight into the molecular biology of this parasite and disease manifestation, which provides a potential focal point for future research. We identified a number of pathways responsible for immune response. This type of large-scale computational scanning could be coupled with proteomic and metabolomic studies of this parasite, leading to novel therapeutic intervention and disease control strategies. We have also successfully affirmed the use of bioinformatics tools for the study of ESTs, which could now serve as a benchmark for the development of new computational EST analysis pipelines.
doi:10.1186/1471-2164-13-S7-S10
PMCID: PMC3521389  PMID: 23282110
6.  Automated Analysis and Reannotation of Subcellular Locations in Confocal Images from the Human Protein Atlas 
PLoS ONE  2012;7(11):e50514.
The Human Protein Atlas contains immunofluorescence images showing subcellular locations for thousands of proteins. These are currently annotated by visual inspection. In this paper, we describe automated approaches to analyze the images and their use to improve annotation. We began by training classifiers to recognize the annotated patterns. By ranking proteins according to the confidence of the classifier, we generated a list of proteins that were strong candidates for reexamination. In parallel, we applied hierarchical clustering to group proteins and identified proteins whose annotations were inconsistent with the remainder of the proteins in their cluster. These proteins were reexamined by the original annotators, and a significant fraction had their annotations changed. The results demonstrate that automated approaches can provide an important complement to visual annotation.
doi:10.1371/journal.pone.0050514
PMCID: PMC3511558  PMID: 23226299
7.  A Complex Set of Sex Pheromones Identified in the Cuttlefish Sepia officinalis 
PLoS ONE  2012;7(10):e46531.
Background
The cephalopod mollusk Sepia officinalis can be considered a relevant model for studying reproduction strategies associated with seasonal migrations. Using transcriptomic and peptidomic approaches, we aim to identify peptide sex pheromones that are thought to induce the aggregation of mature cuttlefish in their egg-laying areas.
Results
To facilitate the identification of sex pheromones, 576 5′-expressed sequence tags (ESTs) were sequenced from a single cDNA library generated from accessory sex glands of female cuttlefish. Our analysis yielded 223 unique sequences composed of 186 singletons and 37 contigs. Three major redundant ESTs called SPα, SPα′ and SPβ were identified as good candidates for putative sex pheromone transcripts and are part of the 87 unique sequences classified as unknown. The alignment of translated SPα and SPα′ revealed a high level of conservation, with 98.4% identity. Translation led to a 248-amino acid precursor containing six peptides with multiple putative disulfide bonds. The alignment of SPα-α′ with SPβ revealed a partial structural conservation, with 37.3% identity. Translation of SPβ led to a 252-amino acid precursor containing five peptides. The occurrence of a signal peptide on SPα, SPα′ and SPβ showed that the peptides were secreted. RT-PCR and mass spectrometry analyses revealed a co-localization of transcripts and expression products in the oviduct gland. Preliminary in vitro experiments performed on gills and penises revealed target organs involved in mating and ventilation.
Conclusions
The analysis of the accessory sex gland transcriptome of Sepia officinalis led to the identification of peptidic sex pheromones. Although preliminary functional tests suggested the involvement of the α3 and β2 peptides in ventilation and mating stimulation, further functional investigations will make it possible to identify the complete set of biological activities expected from waterborne pheromones.
doi:10.1371/journal.pone.0046531
PMCID: PMC3484142  PMID: 23118854
8.  Annotation of the M. tuberculosis Hypothetical Orfeome: Adding Functional Information to More than Half of the Uncharacterized Proteins 
PLoS ONE  2012;7(4):e34302.
The genome of Mycobacterium tuberculosis (H37Rv) contains 4,019 protein coding genes, of which more than a thousand have been categorized as ‘hypothetical’, implying that for these not even weak functional associations could be identified so far. We here predict reliable functional indications for half of this large hypothetical orfeome: 497 genes can be annotated based on orthology, and another 125 can be linked to interacting proteins via integrated genomic context analysis and literature mining. The assignments include newly identified clusters of interacting proteins, hypothetical genes that are associated with well-known pathways and putative disease-relevant targets. Altogether, we have raised the fraction of the proteome with at least some functional annotation to 88%, which should considerably enhance the interpretation of large-scale experiments targeting this medically important organism.
doi:10.1371/journal.pone.0034302
PMCID: PMC3317503  PMID: 22485162
9.  The Transcriptome Analysis of Strongyloides stercoralis L3i Larvae Reveals Targets for Intervention in a Neglected Disease 
Background
Strongyloidiasis is one of the most neglected diseases distributed worldwide with endemic areas in developed countries, where chronic infections are life threatening. Despite its impact, very little is known about the molecular biology of the parasite involved and its interplay with its hosts. Next generation sequencing technologies now provide unique opportunities to rapidly address these questions.
Principal Findings
Here we present the first transcriptome of the third larval stage of S. stercoralis using 454 sequencing coupled with semi-automated bioinformatic analyses. 253,266 raw sequence reads were assembled into 11,250 contiguous sequences, most of which were novel. 8037 putative proteins were characterized based on homology, gene ontology and/or biochemical pathways. Comparison of the transcriptome of S. stercoralis with those of other nematodes, including S. ratti, revealed similarities in transcription of molecules inferred to have key roles in parasite-host interactions. Enzymatic proteins, like kinases and proteases, were abundant. 1213 putative excretory/secretory proteins were compiled using a new pipeline which included non-classical secretory proteins. Potential drug targets were also identified.
Conclusions
Overall, the present dataset should provide a solid foundation for future fundamental genomic, proteomic and metabolomic explorations of S. stercoralis, as well as a basis for applied outcomes, such as the development of novel methods of intervention against this neglected parasite.
Author Summary
Strongyloides stercoralis (Nematoda) is an important parasite of humans, causing strongyloidiasis, considered one of the most neglected diseases, affecting more than 100 million people worldwide. Chronic infections in endemic areas can be maintained for decades through the autoinfective cycle with the L3 filariform larvae. In these areas, misdiagnosis, inadequate treatment and the facilitation of hyperinfection syndrome by immunosuppression are frequent and contribute to a high mortality rate. Among the affected areas, chronic patients have been described in the Valencian Mediterranean coastal region of Spain. Despite its serious impact, very little is known about this parasite and its relationship with its hosts at the molecular level, and more effective diagnostic tests and treatments are needed. Next generation sequencing technologies now provide unique opportunities to rapidly advance in these areas. In this study, we present the first transcriptome of S. stercoralis L3i using 454 sequencing followed by semi-automated bioinformatic analyses. Our study identifies 8037 putative proteins based on homology, gene ontology, and/or biochemical pathways, including putative excretory/secretory proteins as well as potential drug targets. The present dataset provides a useful resource and adds greatly to our understanding of a human parasite affecting both developed and developing countries.
doi:10.1371/journal.pntd.0001513
PMCID: PMC3289599  PMID: 22389732
10.  Long-Term Survival of Hydrated Resting Eggs from Brachionus plicatilis 
PLoS ONE  2012;7(1):e29365.
Background
Several organisms display dormancy and developmental arrest at embryonic stages. Long-term survival in the dormant form is usually associated with desiccation, orthodox plant seeds and Artemia cysts being well documented examples. Several aquatic invertebrates display dormancy during embryonic development and survive for tens or even hundreds of years in a hydrated form, raising the question of whether survival in the non-desiccated form of embryonic development depends on pathways similar to those occurring in desiccation tolerant forms.
Methodology/Principal Findings
To address this question, Illumina short read sequencing was used to generate transcription profiles from the resting and amictic eggs of an aquatic invertebrate, the rotifer Brachionus plicatilis. These two types of egg have very different life histories: the dormant or diapausing resting eggs are the result of the sexual cycle, whereas amictic eggs are the non-dormant products of the asexual cycle. Significant transcriptional differences were found between the two types of egg, with amictic eggs rich in genes involved in the morphological development into a juvenile rotifer. In contrast, representatives of classical “stress” proteins (a small heat shock protein, ferritin and Late Embryogenesis Abundant (LEA) proteins) were identified in resting eggs. More important, however, was the identification of transcripts for messenger ribonucleoprotein particles which stabilise RNA. These inhibit translation and provide a valuable source of useful RNAs which can be rapidly activated on the exit from dormancy. Apoptotic genes were also present. Although apoptosis is inconsistent with maintenance of prolonged dormancy, an altered apoptotic pathway has been proposed for Artemia, and this may be the case with the rotifer.
Conclusions
These data represent the first transcriptional profiling of molecular processes associated with dormancy in a non-desiccated form and indicate important similarities in the molecular pathways activated in resting eggs compared with desiccated dormant forms, specifically plant seeds and Artemia.
doi:10.1371/journal.pone.0029365
PMCID: PMC3253786  PMID: 22253713
11.  Homogeneous Datasets of Triple Negative Breast Cancers Enable the Identification of Novel Prognostic and Predictive Signatures 
PLoS ONE  2011;6(12):e28403.
Background
Current prognostic gene signatures for breast cancer mainly reflect proliferation status and have limited value in triple-negative (TNBC) cancers. The identification of prognostic signatures from TNBC cohorts was limited in the past due to small sample sizes.
Methodology/Principal Findings
We assembled all currently publicly available TNBC gene expression datasets generated on Affymetrix gene chips. Inter-laboratory variation was minimized by filtering methods for both samples and genes. Supervised analysis was performed to identify prognostic signatures from 394 cases, which were subsequently tested on an independent validation cohort (n = 261 cases).
Conclusions/Significance
Using two distinct false discovery rate thresholds, 25% and <3.5%, a larger (n = 264 probesets) and a smaller (n = 26 probesets) prognostic gene set were identified and used as prognostic predictors. Most of these genes were positively associated with poor prognosis and correlated to metagenes for inflammation and angiogenesis. No correlation to other previously published prognostic signatures (recurrence score, genomic grade index, 70-gene signature, wound response signature, 7-gene immune response module, stroma derived prognostic predictor, and a medullary like signature) was observed. In multivariate analyses in the validation cohort the two signatures showed hazard ratios of 4.03 (95% confidence interval [CI] 1.71–9.48; P = 0.001) and 4.08 (95% CI 1.79–9.28; P = 0.001), respectively. The 10-year event-free survival was 70% for the good risk and 20% for the high risk group. The 26-gene signature had modest predictive value (AUC = 0.588) for response to neoadjuvant chemotherapy; however, the combination of a B-cell metagene with the prognostic signatures increased its response predictive value. We identified a 264-gene prognostic signature for TNBC which is unrelated to previously known prognostic signatures.
doi:10.1371/journal.pone.0028403
PMCID: PMC3248403  PMID: 22220191
12.  Towards big data science in the decade ahead from ten years of InCoB and the 1st ISCB-Asia Joint Conference 
BMC Bioinformatics  2011;12(Suppl 13):S1.
The 2011 International Conference on Bioinformatics (InCoB), which is the annual scientific conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted by Kuala Lumpur, Malaysia, and is co-organized with the first ISCB-Asia conference of the International Society for Computational Biology (ISCB). InCoB and the sequencing of the human genome are both celebrating their tenth anniversaries and InCoB’s goalposts for the next decade, implementing standards in bioinformatics and globally distributed computational networks, will be discussed and adopted at this conference. Of the 49 manuscripts (selected from 104 submissions) accepted to BMC Genomics and BMC Bioinformatics conference supplements, 24 are featured in this issue, covering software tools, genome/proteome analysis, systems biology (networks, pathways, bioimaging) and drug discovery and design.
doi:10.1186/1471-2105-12-S13-S1
PMCID: PMC3278825  PMID: 22372736
13.  In silico approach to screen compounds active against parasitic nematodes of major socio-economic importance 
BMC Bioinformatics  2011;12(Suppl 13):S25.
Background
Infections due to parasitic nematodes are common causes of morbidity and fatality around the world, especially in developing nations. At present, however, there are only three major classes of drugs for treating human nematode infections. Additionally, the mechanism of action of these drugs and the reasons for resistance to them are poorly understood. Commercial incentives to design drugs for diseases that are endemic to developing countries are limited; therefore, virtual screening in academic settings can play a vital role in discovering novel drugs useful against neglected diseases. In this study we propose to build a robust machine learning model to classify and screen compounds active against parasitic nematodes.
Results
A set of compounds active against parasitic nematodes was collated from various literature sources including PubChem, while the inactive set was derived from the DrugBank database. The support vector machine (SVM) algorithm was used for model development, and stratified ten-fold cross validation was used to evaluate the performance of each classifier. The best results were obtained using the radial basis function kernel. The SVM method achieved an accuracy of 81.79% on an independent test set. Using the model developed above, we were able to identify novel compounds with potential anthelmintic activity.
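The stratified ten-fold splitting step can be sketched in plain Python as follows. This shows only how samples might be dealt into folds so that each fold keeps roughly the same active/inactive ratio; the function name, the class sizes and the seed are assumptions for illustration, and the classifier itself (an RBF-kernel SVM in the abstract) is left as a comment.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class proportions roughly equal per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical label vector: 30 active and 70 inactive compounds (not study data).
labels = ["active"] * 30 + ["inactive"] * 70
folds = stratified_folds(labels, k=10)

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
```

Each sample is held out exactly once, so the ten per-fold scores can be averaged into a single cross-validation estimate.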
Conclusion
In this study, we successfully present the SVM approach for predicting compounds active against parasitic nematodes, which suggests the effectiveness of computational approaches for antiparasitic drug discovery. Although the accuracy obtained is lower than that previously reported in a similar study, we believe that our model is more robust because we intentionally employed stringent criteria to select the inactive dataset, thus making it more difficult for the model to classify compounds. The method presents an alternative approach to the existing traditional methods and may be useful for predicting hitherto novel anthelmintic compounds.
doi:10.1186/1471-2105-12-S13-S25
PMCID: PMC3278842  PMID: 22373185
14.  InCoB celebrates its tenth anniversary as first joint conference with ISCB-Asia 
BMC Genomics  2011;12(Suppl 3):S1.
In 2009 the International Society for Computational Biology (ISCB) started to roll out regional bioinformatics conferences in Africa, Latin America and Asia. The open and competitive bid for the first meeting in Asia (ISCB-Asia) was awarded to Asia-Pacific Bioinformatics Network (APBioNet) which has been running the International Conference on Bioinformatics (InCoB) in the Asia-Pacific region since 2002. InCoB/ISCB-Asia 2011 is held from November 30 to December 2, 2011 in Kuala Lumpur, Malaysia. Of 104 manuscripts submitted to BMC Genomics and BMC Bioinformatics conference supplements, 49 (47.1%) were accepted. The strong showing of Asia among submissions (82.7%) and acceptances (81.6%) signals the success of this tenth InCoB anniversary meeting, and bodes well for the future of ISCB-Asia.
doi:10.1186/1471-2164-12-S3-S1
PMCID: PMC3333168  PMID: 22369160
15.  In silico secretome analysis approach for next generation sequencing transcriptomic data 
BMC Genomics  2011;12(Suppl 3):S14.
Background
Excretory/secretory proteins (ESPs) play a major role in parasitic infection as they are present at the host-parasite interface and regulate the host immune system. In the case of parasitic helminths, transcriptomics has been used extensively to understand the molecular basis of parasitism and for developing novel therapeutic strategies against parasitic infections. However, no transcriptomic studies have extensively covered ES protein prediction for identifying novel therapeutic targets, especially as parasites adopt non-classical secretion pathways.
Results
We developed a semi-automated computational approach for prediction and annotation of ES proteins using transcriptomic data from next generation sequencing platforms. For the prediction of non-classically secreted proteins, we have used an improved computational strategy, together with homology matching to a dataset of experimentally determined parasitic helminth ES proteins. We applied this protocol to analyse 454 short reads of the parasitic nematode Strongyloides ratti. From 296231 reads, we derived 28901 contigs, which were translated into 20877 proteins. Based on our improved ES protein prediction pipeline, we identified 2572 ES proteins, of which 407 (1.9%) proteins have classical N-terminal signal peptides, 923 (4.4%) were computationally identified as non-classically secreted while 1516 (7.26%) were identified by homology to experimentally identified parasitic helminth ES proteins. Out of 2572 ES proteins, 2310 (89.8%) ES proteins had homologues in the free-living nematode Caenorhabditis elegans and 2220 (86.3%) in parasitic nematodes. We could functionally annotate 1591 (61.8%) ES proteins with protein families and domains and establish pathway associations for 691 (26.8%) proteins. In addition, we have identified 19 representative ES proteins, which have no homologues in the host organism but are homologous to lethal RNAi phenotypes in C. elegans, as potential therapeutic targets.
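The percentages quoted above use two different denominators, which a short calculation makes explicit. The sketch below assumes, from the arithmetic alone, that the per-method figures are fractions of all 20877 translated proteins while the homologue figures are fractions of the 2572 ES proteins; the variable and function names are illustrative.

```python
total_proteins = 20877  # proteins translated from contigs
es_proteins = 2572      # proteins predicted as excretory/secretory

def pct(part, whole, digits=1):
    """Percentage of part in whole, rounded to the given number of decimals."""
    return round(100.0 * part / whole, digits)

# Counts per prediction route, expressed relative to all translated proteins.
by_method = {"signal peptide": 407, "non-classical": 923, "homology": 1516}
method_pct = {name: pct(n, total_proteins, 2) for name, n in by_method.items()}

# Homologue counts, expressed relative to the ES protein set.
c_elegans_pct = pct(2310, es_proteins)
parasitic_pct = pct(2220, es_proteins)

# The three routes sum to more than the ES total, which suggests overlap between them.
overlap = sum(by_method.values()) - es_proteins
```

Under this reading the three routes together count 274 more proteins than the ES total, consistent with some proteins being flagged by more than one route.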
Conclusion
We report a comprehensive approach using freely available computational tools for the secretome analysis of NGS data. This approach has been applied to S. ratti 454 transcriptomic data for in silico excretory/secretory proteins prediction and analysis, providing a foundation for developing new therapeutic solutions for parasitic infections.
doi:10.1186/1471-2164-12-S3-S14
PMCID: PMC3333173  PMID: 22369360
16.  A comparative structural bioinformatics analysis of inherited mutations in β-D-Mannosidase across multiple species reveals a genotype-phenotype correlation  
BMC Genomics  2011;12(Suppl 3):S22.
Background
Lysosomal β-D-mannosidase is a glycosyl hydrolase that breaks down the glycosidic bonds at the non-reducing end of N-linked glycoproteins. Hence, it is a crucial enzyme in the polysaccharide degradation pathway. Mutations in the MANBA gene, which codes for lysosomal β-mannosidase, result in improper coding and malfunctioning of the protein, leading to β-mannosidosis. Studying the location of mutations on the enzyme structure is a rational approach to understanding the functional consequences of these mutations. Accordingly, the pathology and clinical manifestations of the disease could be correlated to the genotypic modifications.
Results
The wild-type and inherited mutations of β-mannosidase were studied across four different species (human, cow, goat and mouse), employing a previously demonstrated comprehensive homology modeling and mutational mapping technique, which reveals a correlation between the variation of genotype and the severity of phenotype in β-mannosidosis. The X-ray crystallographic structure of β-mannosidase from Bacteroides thetaiotaomicron was used as the template for 3D structural modeling of the wild-type enzymes containing all the associated ligands. These wild-type models subsequently served as templates for building mutational structures. Truncations account for approximately 70% of the mutational cases. In general, the proximity of mutations to the active site determines the severity of phenotypic expressions. Mapping mutations to the MANBA gene sequence has identified five mutational hot-spots.
Conclusion
Although restrained by a limited dataset, our comprehensive study suggests a genotype-phenotype correlation in β-mannosidosis. A predictive approach for detecting likely β-mannosidosis is also demonstrated where we have extrapolated observed mutations from one species to homologous positions in other organisms based on the proximity of the mutations to the enzyme active site and their co-location from different organisms. Apart from aiding the detection of mutational hotspots in the gene, where novel mutations could be disease-implicated, this approach also provides a way to predict new disease mutations. Higher expression of the exoglycosidase chitobiase is said to play a vital role in determining disease phenotypes in human and mouse. A bigger dataset of inherited mutations as well as a parallel study of β-mannosidase and chitobiase activities in prospective patients would be interesting to better understand the underlying reasons for β-mannosidosis.
doi:10.1186/1471-2164-12-S3-S22
PMCID: PMC3333182  PMID: 22369051
17.  Structural diversity of biologically interesting datasets: a scaffold analysis approach 
Background
The recent public availability of the human metabolome and natural product datasets has revitalized "metabolite-likeness" and "natural product-likeness" as a drug design concept to design lead libraries targeting specific pathways. Many reports have analyzed the physicochemical property space of biologically important datasets, with only a few comprehensively characterizing the scaffold diversity in public datasets of biological interest. With large collections of high quality public data currently available, we carried out a comparative analysis of current day leads with other biologically relevant datasets.
Results
In this study, we note a two-fold enrichment of metabolite scaffolds in the drug dataset (42%) as compared to currently used lead libraries (23%). We also note that only a small percentage (5%) of the natural product scaffold space is shared by the lead dataset. We have identified specific scaffolds that are present in metabolites and natural products, with close counterparts in the drugs, but are missing in the lead dataset. To determine the distribution of compounds in physicochemical property space, we analyzed the molecular polar surface area, the molecular solubility, the number of rings and the number of rotatable bonds in addition to four well-known Lipinski properties. Here, we note that, with only a few exceptions, most of the drugs follow Lipinski's rule. The average values of the molecular polar surface area and the molecular solubility are the highest in metabolites, while the number of rings is the lowest. In addition, we note that natural products contain more rings and rotatable bonds than any other dataset under consideration.
Conclusions
Currently used lead libraries make little use of the metabolite and natural product scaffold space. We believe that metabolites and natural products are recognized by at least one protein in the biosphere; therefore, sampling the fragment and scaffold space of these compounds, along with the knowledge of their distribution in physicochemical property space, can result in better lead libraries. Hence, we recommend the greater use of metabolites and natural products while designing lead libraries. Nevertheless, metabolites have a limited distribution in chemical space, which limits their usage in library design.
doi:10.1186/1758-2946-3-30
PMCID: PMC3179739  PMID: 21824432
18.  Understanding TR Binding to pMHC Complexes: How Does a TR Scan Many pMHC Complexes yet Preferentially Bind to One 
PLoS ONE  2011;6(2):e17194.
Understanding the basis of the binding of a T cell receptor (TR) to the peptide-MHC (pMHC) complex is essential due to the vital role it plays in the adaptive immune response. We describe the use of computed binding (free) energy (BE), TR paratope, pMHC epitope, molecular surface electrostatic potential (MSEP) and calculated TR docking angle (θ) to analyse 61 TR/pMHC crystallographic structures to comprehend TR/pMHC interaction. In doing so, we have successfully demonstrated a novel/rational approach for θ calculation, obtained a linear correlation between BE and θ without any “codon” or amino acid preference, provided an explanation for the TR's ability to scan many pMHC ligands yet specifically bind one, proposed a mechanism for pMHC recognition by the TR leading to T cell activation and illustrated the importance of the peptide in determining TR specificity, challenging the “germline bias” theory.
doi:10.1371/journal.pone.0017194
PMCID: PMC3043089  PMID: 21364947
19.  Towards BioDBcore: a community-defined information specification for biological databases 
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources; and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
doi:10.1093/database/baq027
PMCID: PMC3017395  PMID: 21205783
20.  Meeting Report from the Second “Minimum Information for Biological and Biomedical Investigations” (MIBBI) workshop 
Standards in Genomic Sciences  2010;3(3):259-266.
This report summarizes the proceedings of the second workshop of the ‘Minimum Information for Biological and Biomedical Investigations’ (MIBBI) consortium held on Dec 1-2, 2010 in Rüdesheim, Germany through the sponsorship of the Beilstein-Institute. MIBBI is an umbrella organization uniting communities developing Minimum Information (MI) checklists to standardize the description of data sets, the workflows by which they were generated and the scientific context for the work. This workshop brought together representatives of more than twenty communities to present the status of their MI checklists and plans for future development. Shared challenges and solutions were identified and the role of MIBBI in MI checklist development was discussed. The meeting featured some thirty presentations, wide-ranging discussions and breakout groups. The top outcomes of the two-day workshop as defined by the participants were: 1) the chance to share best practices and to identify areas of synergy; 2) defining a series of tasks for updating the MIBBI Portal; 3) reemphasizing the need to maintain independent MI checklists for various communities while leveraging common terms and workflow elements contained in multiple checklists; and 4) revision of the concept of the MIBBI Foundry to focus on the creation of a core set of MIBBI modules intended for reuse by individual MI checklist projects while maintaining the integrity of each MI project. Further information about MIBBI and its range of activities can be found at http://mibbi.org/.
doi:10.4056/sigs.147362
PMCID: PMC3035314  PMID: 21304730
21.  Advancing standards for bioinformatics activities: persistence, reproducibility, disambiguation and Minimum Information About a Bioinformatics investigation (MIABi) 
BMC Genomics  2010;11(Suppl 4):S27.
The 2010 International Conference on Bioinformatics, InCoB2010, which is the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), has agreed to publish conference papers in compliance with the Minimum Information about a Bioinformatics investigation (MIABi), proposed in June 2009. Authors of the conference supplements in BMC Bioinformatics, BMC Genomics and Immunome Research have consented to cooperate in this process, which will include the procedures described herein, where appropriate, to ensure data and software persistence and perpetuity, database and resource re-instantiability and reproducibility of results, author and contributor identity disambiguation and MIABi-compliance. Wherever possible, datasets and databases will be submitted to depositories with standardized terminologies. As standards are evolving, this process is intended as a prelude to the 100 BioDatabases (BioDB100) initiative whereby APBioNet collaborators will contribute exemplar databases to demonstrate the feasibility of standards-compliance and participate in refining the process for peer-review of such publications and validation of scientific claims and standards compliance. This testbed represents another step in advancing standards-based processes in the bioinformatics community, which is essential to the growing interoperability of biological data, information, knowledge and computational resources.
doi:10.1186/1471-2164-11-S4-S27
PMCID: PMC3005918  PMID: 21143811
22.  Challenges of the next decade for the Asia Pacific region: 2010 International Conference in Bioinformatics (InCoB 2010) 
BMC Genomics  2010;11(Suppl 4):S1.
The 2010 annual conference of the Asia Pacific Bioinformatics Network (APBioNet), Asia’s oldest bioinformatics organisation formed in 1998, was organized as the 9th International Conference on Bioinformatics (InCoB), Sept. 26-28, 2010 in Tokyo, Japan. Initially, APBioNet created InCoB as a forum to foster bioinformatics in the Asia Pacific region. Given the growing importance of interdisciplinary research, InCoB2010 included topics targeting scientists in the fields of genomic medicine, immunology and chemoinformatics, supporting translational research. Peer-reviewed manuscripts that were accepted for publication in this supplement represent key areas of research interest that have emerged in our region. We also highlight some of the current challenges bioinformatics is facing in the Asia Pacific region and conclude our report with the announcement of APBioNet’s 100 BioDatabases (BioDB100) initiative. BioDB100 will comply with the database criteria set out earlier in our proposal for Minimum Information about a Bioinformatics Investigation (MIABi), setting the standards for biocuration and bioinformatics research, on which we will report at the next InCoB, Nov. 27 – Dec. 2, 2011 at Kuala Lumpur, Malaysia.
doi:10.1186/1471-2164-11-S4-S1
PMCID: PMC3005919  PMID: 21143792
23.  Secretome: clues into pathogen infection and clinical applications 
Genome Medicine  2009;1(11):113.
The secretome encompasses the complete set of gene products secreted by a cell. Recent studies on secretome analysis reveal that secretory proteins play an important role in pathogen infection and host-pathogen interactions. Excretory/secretory proteins of pathogens change the host cell environment by suppressing the immune system, to aid the proliferation of infection. Identifying secretory proteins involved in pathogen infection will lead to the discovery of potential drug targets and biomarkers for diagnostic applications.
doi:10.1186/gm113
PMCID: PMC2808748  PMID: 19951402
24.  Towards BioDBcore: a community-defined information specification for biological databases 
Nucleic Acids Research  2010;39(Database issue):D7-D10.
The present article proposes the adoption of a community-defined, uniform, generic description of the core attributes of biological databases, BioDBCore. The goals of these attributes are to provide a general overview of the database landscape, to encourage consistency and interoperability between resources and to promote the use of semantic and syntactic standards. BioDBCore will make it easier for users to evaluate the scope and relevance of available resources. This new resource will increase the collective impact of the information present in biological databases.
doi:10.1093/nar/gkq1173
PMCID: PMC3013734  PMID: 21097465
25.  InCoB2010 - 9th International Conference on Bioinformatics at Tokyo, Japan, September 26-28, 2010 
BMC Bioinformatics  2010;11(Suppl 7):S1.
The International Conference on Bioinformatics (InCoB), the annual conference of the Asia-Pacific Bioinformatics Network (APBioNet), is hosted in one of the countries of the Asia-Pacific region. The 2010 conference was awarded to Japan and has attracted more than one hundred high-quality research paper submissions. Thorough peer reviewing resulted in 47 (43.5%) accepted papers out of 108 submissions. Submissions from Japan, R.O. Korea, P.R. China, Australia, Singapore and U.S.A. totaled 43.8% and contributed to 57.4% of accepted papers. Manuscripts originating from Taiwan and India added up to 42.8% of submissions and 28.3% of acceptances. The fifteen articles published in this BMC Bioinformatics supplement cover disease informatics, structural bioinformatics and drug design, biological databases and software tools, signaling pathways, gene regulatory and biochemical networks, evolution and sequence analysis.
doi:10.1186/1471-2105-11-S7-S1
PMCID: PMC2957677  PMID: 21106116
