
Results 1-14 (14)

1.  An Integrative Computational Approach for Prioritization of Genomic Variants 
PLoS ONE  2014;9(12):e114903.
An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.
PMCID: PMC4266634  PMID: 25506935
2.  Lynx web services for annotations and systems analysis of multi-gene disorders 
Nucleic Acids Research  2014;42(Web Server issue):W473-W477.
Lynx is a web-based integrated systems biology platform that supports annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Lynx has integrated multiple classes of biomedical data (genomic, proteomic, pathways, phenotypic, toxicogenomic, contextual and others) from various public databases as well as manually curated data from our group and collaborators (LynxKB). Lynx provides tools for gene list enrichment analysis using multiple functional annotations and network-based gene prioritization. Lynx provides access to the integrated database and the analytical tools via REST-based Web Services. This comprises data retrieval services for specific functional annotations, services to search across the complete LynxKB (powered by Lucene), and services to access the analytical tools built within the Lynx platform.
PMCID: PMC4086124  PMID: 24948611
3.  Lynx: a database and knowledge extraction engine for integrative medicine 
Nucleic Acids Research  2013;42(Database issue):D1007-D1012.
We have developed Lynx, a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces.
PMCID: PMC3965040  PMID: 24270788
4.  Microvesicles and intercellular communication in the context of parasitism 
There is a rapidly growing body of evidence that production of microvesicles (MVs) is a universal feature of cellular life. MVs can incorporate microRNA (miRNA), mRNA, mtDNA, DNA and retrotransposons, camouflage viruses/viral components from immune surveillance, and transfer cargo between cells. These properties make MVs an essential player in intercellular communication. Increasing evidence supports the notion that MVs can also act as long-distance vehicles for RNA molecules and participate in metabolic synchronization and reprogramming eukaryotic cells including stem and germinal cells. MV ability to carry on DNA and their general distribution makes them attractive candidates for horizontal gene transfer, particularly between multi-cellular organisms and their parasites; this suggests important implications for the co-evolution of parasites and their hosts. In this review, we provide current understanding of the roles played by MVs in intracellular pathogens and parasitic infections. We also discuss the possible role of MVs in co-infection and host shifting.
PMCID: PMC3764926  PMID: 24032108
microvesicles; exosomes; miRNA; parasite; metabolism synchronization; horizontal gene transfer; co-infection; Plasmodium
5.  Copy number variants and infantile spasms: evidence for abnormalities in ventral forebrain development and pathways of synaptic function 
European Journal of Human Genetics  2011;19(12):1238-1245.
Infantile spasms (ISS) are an epilepsy disorder frequently associated with severe developmental outcome and have diverse genetic etiologies. We ascertained 11 subjects with ISS and novel copy number variants (CNVs) and combined these with a new cohort with deletion 1p36 and ISS, and additional published patients with ISS and other chromosomal abnormalities. Using bioinformatics tools, we analyzed the gene content of these CNVs for enrichment in pathways of pathogenesis. Several important findings emerged. First, the gene content was enriched for the gene regulatory network involved in ventral forebrain development. Second, genes in pathways of synaptic function were overrepresented, significantly those involved in synaptic vesicle transport. Evidence also suggested roles for GABAergic synapses and the postsynaptic density. Third, we confirm the association of ISS with duplication of 14q12 and maternally inherited duplication of 15q11q13, and report the association with duplication of 21q21. We also present a patient with ISS and deletion 7q11.3 not involving MAGI2. Finally, we provide evidence that ISS in deletion 1p36 may be associated with deletion of KLHL17 and expand the epilepsy phenotype in that syndrome to include early infantile epileptic encephalopathy. Several of the identified pathways share functional links, and abnormalities of forebrain synaptic growth and function may form a common biologic mechanism underlying both ISS and autism. This study demonstrates a novel approach to the study of gene content in subjects with ISS and copy number variation, and contributes further evidence to support specific pathways of pathogenesis.
PMCID: PMC3230360  PMID: 21694734
infantile spasms; autism; bioinformatics; copy number variation; deletion 1p36 syndrome
7.  BioPAX – A community standard for pathway data sharing 
Demir, Emek | Cary, Michael P. | Paley, Suzanne | Fukuda, Ken | Lemer, Christian | Vastrik, Imre | Wu, Guanming | D’Eustachio, Peter | Schaefer, Carl | Luciano, Joanne | Schacherer, Frank | Martinez-Flores, Irma | Hu, Zhenjun | Jimenez-Jacinto, Veronica | Joshi-Tope, Geeta | Kandasamy, Kumaran | Lopez-Fuentes, Alejandra C. | Mi, Huaiyu | Pichler, Elgar | Rodchenkov, Igor | Splendiani, Andrea | Tkachev, Sasha | Zucker, Jeremy | Gopinath, Gopal | Rajasimha, Harsha | Ramakrishnan, Ranjani | Shah, Imran | Syed, Mustafa | Anwar, Nadia | Babur, Ozgun | Blinov, Michael | Brauner, Erik | Corwin, Dan | Donaldson, Sylva | Gibbons, Frank | Goldberg, Robert | Hornbeck, Peter | Luna, Augustin | Murray-Rust, Peter | Neumann, Eric | Reubenacker, Oliver | Samwald, Matthias | van Iersel, Martijn | Wimalaratne, Sarala | Allen, Keith | Braun, Burk | Whirl-Carrillo, Michelle | Dahlquist, Kam | Finney, Andrew | Gillespie, Marc | Glass, Elizabeth | Gong, Li | Haw, Robin | Honig, Michael | Hubaut, Olivier | Kane, David | Krupa, Shiva | Kutmon, Martina | Leonard, Julie | Marks, Debbie | Merberg, David | Petri, Victoria | Pico, Alex | Ravenscroft, Dean | Ren, Liya | Shah, Nigam | Sunshine, Margot | Tang, Rebecca | Whaley, Ryan | Letovksy, Stan | Buetow, Kenneth H. | Rzhetsky, Andrey | Schachter, Vincent | Sobral, Bruno S. | Dogrusoz, Ugur | McWeeney, Shannon | Aladjem, Mirit | Birney, Ewan | Collado-Vides, Julio | Goto, Susumu | Hucka, Michael | Le Novère, Nicolas | Maltsev, Natalia | Pandey, Akhilesh | Thomas, Paul | Wingender, Edgar | Karp, Peter D. | Sander, Chris | Bader, Gary D.
Nature Biotechnology  2010;28(9):935-942.
BioPAX (Biological Pathway Exchange) is a standard language to represent biological pathways at the molecular and cellular level. Its major use is to facilitate the exchange of pathway data. Pathway data captures our understanding of biological processes, but its rapid growth necessitates development of databases and computational tools to aid interpretation. However, the current fragmentation of pathway information across many databases with incompatible formats presents barriers to its effective use. BioPAX solves this problem by making pathway data substantially easier to collect, index, interpret and share. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks. BioPAX was created through a community process. Through BioPAX, millions of interactions organized into thousands of pathways across many organisms, from a growing number of sources, are available. Thus, large amounts of pathway data are available in a computable form to support visualization, analysis and biological discovery.
PMCID: PMC3001121  PMID: 20829833
pathway data integration; pathway database; standard exchange format; ontology; information system
8.  The minimum information about a genome sequence (MIGS) specification 
Nature Biotechnology  2008;26(5):541-547.
With the quantity of genomic data increasing at an exponential rate, it is imperative that these data be captured electronically, in a standard format. Standardization activities must proceed within the auspices of open-access and international working bodies. To tackle the issues surrounding the development of better descriptions of genomic investigations, we have formed the Genomic Standards Consortium (GSC). Here, we introduce the minimum information about a genome sequence (MIGS) specification with the intent of promoting participation in its development and discussing the resources that will be required to develop improved mechanisms of metadata capture and exchange. As part of its wider goals, the GSC also supports improving the ‘transparency’ of the information contained in existing genomic databases.
PMCID: PMC2409278  PMID: 18464787
9.  Identification of Francisella tularensis Himar1-Based Transposon Mutants Defective for Replication in Macrophages
Infection and Immunity  2007;75(11):5376-5389.
Francisella tularensis, the etiologic agent of tularemia in humans, is a potential biological threat due to its low infectious dose and multiple routes of entry. F. tularensis replicates within several cell types, eventually causing cell death by inducing apoptosis. In this study, a modified Himar1 transposon (HimarFT) was used to mutagenize F. tularensis LVS. Approximately 7,000 kanamycin-resistant (Kmr) clones were screened using J774A.1 macrophages for reduction in cytopathogenicity based on retention of the cell monolayer. A total of 441 candidates with significant host cell retention compared to the parent were identified following screening in a high-throughput format. Retesting at a defined multiplicity of infection followed by in vitro growth analyses resulted in identification of approximately 70 candidates representing 26 unique loci involved in macrophage replication and/or cytotoxicity. Mutants carrying insertions in seven hypothetical genes were screened in a mouse model of infection, and all strains tested appeared to be attenuated, which validated the initial in vitro results obtained with cultured macrophages. Complementation and reverse transcription-PCR experiments suggested that the expression of genes adjacent to the HimarFT insertion may be affected depending on the orientation of the constitutive groEL promoter region used to ensure transcription of the selective marker in the transposon. A hypothetical gene, FTL_0706, postulated to be important for lipopolysaccharide biosynthesis, was confirmed to be a gene involved in O-antigen expression in F. tularensis LVS and Schu S4. These and other studies demonstrate that therapeutic targets, vaccine candidates, or virulence-related genes may be discovered utilizing classical genetic approaches in Francisella.
PMCID: PMC2168294  PMID: 17682043
10.  Sentra: a database of signal transduction proteins for comparative genome analysis 
Nucleic Acids Research  2006;35(Database issue):D271-D273.
Sentra, a database of signal transduction proteins encoded in completely sequenced prokaryotic genomes, has been updated to reflect recent advances in understanding signal transduction events on a whole-genome scale. Sentra consists of two principal components, a manually curated list of signal transduction proteins in 202 completely sequenced prokaryotic genomes and an automatically generated listing of predicted signaling proteins in 235 sequenced genomes that are awaiting manual curation. In addition to two-component histidine kinases and response regulators, the database now lists manually curated Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews. All entries in Sentra are extensively annotated with relevant information from public databases (e.g. UniProt, KEGG, PDB and NCBI). Sentra's infrastructure was redesigned to support interactive cross-genome comparisons of signal transduction capabilities of prokaryotic organisms from a taxonomic and phenotypic perspective and in the framework of signal transduction pathways from KEGG. Sentra leverages the PUMA2 system to support interactive analysis and annotation of signal transduction proteins by the users.
PMCID: PMC1751548  PMID: 17135204
11.  PUMA2—grid-based high-throughput analysis of genomes and metabolic pathways 
Nucleic Acids Research  2005;34(Database issue):D369-D372.
The PUMA2 system is an interactive, integrated bioinformatics environment for high-throughput genetic sequence analysis and metabolic reconstructions from sequence data. PUMA2 provides a framework for comparative and evolutionary analysis of genomic data and metabolic networks in the context of taxonomic and phenotypic information. Grid infrastructure is used to perform computationally intensive tasks. PUMA2 currently contains precomputed analysis of 213 prokaryotic, 22 eukaryotic, 650 mitochondrial and 1493 viral genomes and automated metabolic reconstructions for >200 organisms. Genomic data is annotated with information integrated from >20 sequence, structural and metabolic databases and ontologies. PUMA2 supports both automated and interactive expert-driven annotation of genomes, using a variety of publicly available bioinformatics tools. It also contains a suite of unique PUMA2 tools for automated assignment of gene function, evolutionary analysis of protein families and comparative analysis of metabolic pathways. PUMA2 allows users to submit batch sequence data for automated functional analysis and construction of metabolic models. The results of these analyses are made available to the users in the PUMA2 environment for further interactive sequence analysis and annotation.
PMCID: PMC1347457  PMID: 16381888
12.  Sentra, a database of signal transduction proteins 
Nucleic Acids Research  2002;30(1):349-350.
Sentra is a database of signal transduction proteins with the emphasis on microbial signal transduction. The database was updated to include classes of signal transduction systems modulated by either phosphorylation or methylation reactions such as PAS proteins and serine/threonine kinases, as well as the classical two-component histidine kinases and methyl-accepting chemotaxis proteins. Currently, Sentra contains signal transduction proteins from 43 completely sequenced prokaryotic genomes as well as sequences from SWISS-PROT and TrEMBL. Signal transduction proteins are annotated with information describing conserved domains, paralogous and orthologous sequences, and conserved chromosomal gene clusters. The newly developed user interface supports flexible search capabilities and extensive visualization of the data.
PMCID: PMC99115  PMID: 11752334
13.  WIT: integrated system for high-throughput genome sequence analysis and metabolic reconstruction 
Nucleic Acids Research  2000;28(1):123-125.
The WIT (What Is There) system has been designed to support comparative analysis of sequenced genomes and to generate metabolic reconstructions based on chromosomal sequences and metabolic modules from the EMP/MPW family of databases. This system contains data derived from about 40 completed or nearly completed genomes. Sequence homologies, various ORF-clustering algorithms, relative gene positions on the chromosome and placement of gene products in metabolic pathways (metabolic reconstruction) can be used for the assignment of gene functions and for development of overviews of genomes within WIT. The integration of a large number of phylogenetically diverse genomes in WIT facilitates the understanding of the physiology of different organisms.
PMCID: PMC102471  PMID: 10592199
14.  SENTRA, a database of signal transduction proteins 
Nucleic Acids Research  2000;28(1):335-336.
SENTRA is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.
PMCID: PMC102390  PMID: 10592266
