1.  Beyond Field Effect: Analysis of Shrunken Centroids in Normal Esophageal Epithelia Detects Concomitant Esophageal Adenocarcinoma 
Background and Aims:
Because of the extremely low neoplastic progression rate in Barrett’s esophagus, it is difficult to diagnose patients with concomitant adenocarcinoma early in their disease course. If biomarkers existed in normal squamous esophageal epithelium to identify patients with concomitant esophageal adenocarcinoma, potential applications would be far-reaching. The aim of the current study was to identify global gene expression patterns in normal esophageal epithelium capable of revealing simultaneous esophageal adenocarcinoma, even located remotely in the esophagus.
Methods:
Tissues comprised normal esophageal epithelia from 9 patients with esophageal adenocarcinoma, 8 patients lacking esophageal adenocarcinoma or Barrett’s, and 6 patients with Barrett’s esophagus alone. cDNA microarrays were performed, and pattern recognition in each of these subgroups was achieved using shrunken nearest centroid predictors.
Results:
Our method accurately discriminated normal esophageal epithelia of 8/8 patients without esophageal adenocarcinoma or Barrett’s esophagus and of 6/6 patients with Barrett’s esophagus alone from normal esophageal epithelia of 9/9 patients with Barrett’s esophagus and concomitant esophageal adenocarcinoma. Moreover, we identified genes differentially expressed between the above subgroups. Thus, based on their corresponding normal esophageal epithelia alone, our method accurately diagnosed patients who had concomitant esophageal adenocarcinoma.
Conclusions:
These global gene expression patterns, along with individual genes culled from them, represent potential biomarkers for the early diagnosis of esophageal adenocarcinoma from normal esophageal epithelia. Genes discovered in normal esophagus that are differentially expressed in patients with vs. without esophageal adenocarcinoma merit further pursuit in molecular genetic, functional, and therapeutic interventional studies.
PMCID: PMC2323355  PMID: 18425214
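The shrunken nearest centroid idea named in the Methods can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it shrinks each class centroid toward the overall centroid by soft-thresholding standardized offsets, but omits the class-size factor and variance offset used in the published method. The function names, the threshold `delta`, and the toy data are all assumptions made for the example.

```python
import math

def soft_threshold(value, delta):
    """Shrink value toward zero by delta, clipping at zero."""
    return math.copysign(max(abs(value) - delta, 0.0), value)

def fit_shrunken_centroids(samples, labels, delta):
    """Return (overall centroid, per-gene pooled sd, shrunken offsets per class)."""
    n, n_genes = len(samples), len(samples[0])
    classes = sorted(set(labels))
    overall = [sum(s[g] for s in samples) / n for g in range(n_genes)]
    centroids = {}
    for c in classes:
        rows = [s for s, y in zip(samples, labels) if y == c]
        centroids[c] = [sum(r[g] for r in rows) / len(rows) for g in range(n_genes)]
    # Pooled within-class standard deviation per gene (1.0 if a gene is constant).
    sd = []
    for g in range(n_genes):
        ss = sum((s[g] - centroids[y][g]) ** 2 for s, y in zip(samples, labels))
        sd.append(math.sqrt(ss / (n - len(classes))) or 1.0)
    # Standardized centroid offsets, shrunk toward zero by delta.
    offsets = {
        c: [soft_threshold((centroids[c][g] - overall[g]) / sd[g], delta)
            for g in range(n_genes)]
        for c in classes
    }
    return overall, sd, offsets

def predict(sample, model):
    """Assign the class whose shrunken centroid is nearest in standardized distance."""
    overall, sd, offsets = model
    def dist(c):
        return sum(
            ((sample[g] - (overall[g] + sd[g] * offsets[c][g])) / sd[g]) ** 2
            for g in range(len(sample))
        )
    return min(offsets, key=dist)

# Toy data (hypothetical): gene 0 separates the classes, gene 1 is noise.
samples = [[1.0, 0.5], [1.2, 0.4], [3.0, 0.6], [3.2, 0.5]]
labels = ["A", "A", "B", "B"]
model = fit_shrunken_centroids(samples, labels, delta=1.0)
print(model[2])                       # shrunken offsets per class
print(predict([1.05, 0.6], model))    # class assigned to a new sample
```

Genes whose offsets shrink to zero for every class no longer influence the prediction, which is how the method doubles as a gene-selection step.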
2.  Rarity of Somatic Mutation and Frequency of Normal Sequence Variation Detected in Sporadic Colon Adenocarcinoma Using High-Throughput cDNA Sequencing 
We performed high-throughput cDNA sequencing in colorectal adenocarcinoma and matching normal colorectal epithelium. All six hundred three genes in the UCSC database that were expressed in colon cancers and contained open reading frames of 1000 nucleotides or less were selected for study (total basepairs/bp, 366,686). 304,350 of these 366,686 bp (83.0%) were amplified and sequenced successfully. Seventy-eight sequence variants present in germline (i.e. normal) as well as matching somatic (i.e. tumor) DNA were discovered, yielding a frequency of 1 variant per 3,902 bp. Fifty-one of these sequence variants were homozygous (26 synonymous, 25 non-synonymous), while 27 were heterozygous (11 synonymous, 16 non-synonymous). Cancer tissue contained only one sequence-altered allele of the gene ATP50, which was present heterozygously alongside the wild-type allele in matching normal epithelium. Despite this relatively large number of bp and genes sequenced, no somatic mutations unique to tumor were found. High-throughput cDNA sequencing is a practical approach for detecting novel sequence variations and alterations in human tumors, such as those of the colon.
PMCID: PMC2287164  PMID: 18389087
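The figures quoted in this abstract can be checked with a few lines of arithmetic. The snippet below only recomputes the reported percentages and counts from the numbers given above; the variable names are chosen for the example.

```python
# Figures quoted in the abstract above.
total_bp = 366_686        # base pairs selected across 603 genes
sequenced_bp = 304_350    # base pairs amplified and sequenced successfully
variants = 78             # variants present in both normal and tumor DNA

coverage_pct = 100 * sequenced_bp / total_bp
bp_per_variant = sequenced_bp / variants

homozygous = 26 + 25      # synonymous + non-synonymous
heterozygous = 11 + 16    # synonymous + non-synonymous

print(f"coverage: {coverage_pct:.1f}%")
print(f"one variant per {bp_per_variant:,.0f} bp")
print(f"homozygous + heterozygous = {homozygous + heterozygous}")
```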
4.  The HapMap Resource is Providing New Insights into Ourselves and its Application to Pharmacogenomics 
The exploration of quantitative variation in complex traits such as gene expression and drug response in human populations has become one of the major priorities for medical genetics. The International HapMap Project provides a key resource of genotypic data on human lymphoblastoid cell lines derived from four major world populations of European, African, Chinese and Japanese ancestry for researchers to associate with various phenotypic data to find genes affecting health, disease and response to drugs. Recent progress in dissecting genetic contribution to natural variation in gene expression within and among human populations and variation in drug response are two examples in which researchers have utilized the HapMap resource. The HapMap Project provides new insights into the human genome and has applicability to pharmacogenomics studies leading to personalized medicine.
PMCID: PMC2288550  PMID: 18392109
HapMap; Lymphoblastoid cell lines; Genotype; Gene expression; Population genetics
7.  Computational Small RNA Prediction in Bacteria 
Bacterial small RNAs, once regarded simply as potent regulators of gene expression, are now considered essential for their diversified roles. Many small RNAs are now reported to have a wide array of regulatory functions, ranging from environmental sensing to pathogenesis. Traditionally, noncoding transcripts were rarely detected by means of genetic screens. However, the availability of approximately 2200 prokaryotic genome sequences in public databases facilitates the efficient computational search for those molecules, followed by experimental validation. In principle, the following four major computational methods have been applied to predict sRNA locations from bacterial genome sequences: (1) comparative genomics, (2) secondary structure and thermodynamic stability, (3) ‘orphan’ transcriptional signals, and (4) ab initio methods regardless of sequence or structure similarity. Most of these tools were applied to locate putative genomic sRNA locations, followed by experimental validation of those transcripts. Computational screening has therefore simplified the sRNA identification process in bacteria. In this review, the small RNA prediction methods and tools reported in the past decade are discussed comprehensively and assessed based on their attributes, compatibility, and prediction accuracy.
doi:10.4137/BBI.S11213
PMCID: PMC3596055
comparative genomics; base composition; ncRNA; sRNA prediction; structure stability; transcriptional signal
8.  Construction of a Computable Network Model for DNA Damage, Autophagy, Cell Death, and Senescence 
Towards the development of a systems biology-based risk assessment approach for environmental toxicants, including tobacco products in a systems toxicology setting such as the “21st Century Toxicology”, we are building a series of computable biological network models specific to non-diseased pulmonary and cardiovascular cells/tissues which capture the molecular events that can be activated following exposure to environmental toxicants. Here we extend previous work and report on the construction and evaluation of a mechanistic network model focused on DNA damage response and the four main cellular fates induced by stress: autophagy, apoptosis, necroptosis, and senescence. In total, the network consists of 34 sub-models containing 1052 unique nodes and 1538 unique edges, which are supported by 1231 PubMed-referenced literature citations. Causal node-edge relationships are described using the Biological Expression Language (BEL), which allows for the semantic representation of life science relationships in a computable format. The network is provided in .XGMML format and can be viewed using freely available network visualization software, such as Cytoscape.
doi:10.4137/BBI.S11154
PMCID: PMC3596057
computable; network model; DNA damage; autophagy; apoptosis; necroptosis; senescence; Biological Expression Language (BEL)
9.  Integrative Approach for Computationally Inferring Interactions between the Alpha and Beta Subunits of the Calcium-Activated Potassium Channel (BK): a Docking Study 
Three-dimensional models of the alpha- and beta-1 subunits of the calcium-activated potassium channel (BK) were predicted by threading modeling. A recursive approach comprising sequence alignment and model building based on three templates was used to build these models, with the refinement of non-conserved regions carried out using threading techniques. The complex formed by the subunits was studied by means of docking techniques, using 3D models of the two subunits and an approach based on rigid-body structures. Structural effects of the complex were analyzed with respect to hydrogen-bond interactions and binding-energy calculations. Potential interaction sites of the complex were determined by referencing a study of the difference accessible surface area (DASA) of the protein subunits in the complex.
doi:10.4137/BBI.S10077
PMCID: PMC3588595  PMID: 23492851
potassium channel; docking; molecular interactions; binding energy
10.  A Statistical Method without Training Step for the Classification of Coding Frame in Transcriptome Sequences 
In this study, we investigated the modalities of coding open reading frame (cORF) classification of expressed sequence tags (EST) by using the universal feature method (UFM). The UFM algorithm is based on the scoring of purine bias (Rrr) and stop codon frequencies. UFM classifies ORFs as coding or non-coding through a score based on 5 factors: (i) stop codon frequency; (ii) the product of the probabilities of purines occurring in the three positions of nucleotide triplets; (iii) the product of the probabilities of Cytosine (C), Guanine (G), and Adenine (A) occurring in the 1st, 2nd, and 3rd positions of triplets, respectively; (iv) the probabilities of a G occurring in the 1st and 2nd positions of triplets; and (v) the probabilities of a T occurring in the 1st and an A in the 2nd position of triplets. Because UFM is based on primary determinants of coding sequences that are conserved throughout the biosphere, it is suitable for cORF classification of any sequence in eukaryote transcriptomes without prior knowledge. Considering the protein sequences of the Protein Data Bank (RCSB PDB or more simply PDB) as a reference, we found that UFM classifies cORFs of ≥200 bp (if the coding strand is known) and cORFs of ≥300 bp (if the coding strand is unknown), and releases them in their coding strand and coding frame, which allows their automatic translation into protein sequences with a success rate equal to or higher than 95%. We first established the statistical parameters of UFM using ESTs from Plasmodium falciparum, Arabidopsis thaliana, Oryza sativa, Zea mays, Drosophila melanogaster, Homo sapiens and Chlamydomonas reinhardtii in reference to the protein sequences of PDB. Second, we showed that the success rate of cORF classification using UFM is expected to apply to approximately 95% of higher eukaryote genes that encode for proteins. 
Third, we used UFM in combination with CAP3 to assemble large EST samples into cORFs that we used to analyze transcriptome phenotypes in rice, maize, and humans. We discuss the error rate and the interference of noisy sequences such as pseudogenes, transposons, and retrotransposons. This method is suitable for rapid cORF extraction from transcriptome data and allows correct description of the genome phenotypes of plant genomes without prior knowledge. Additional care is necessary when addressing the human transcriptome due to the interference caused by large amounts of noisy sequences. UFM can be regarded as a low complexity tool for prior knowledge extraction concerning the coding fraction of the transcriptome of any eukaryote. Due to its low level of complexity, UFM is also very robust to variations of codon usage.
doi:10.4137/BBI.S10053
PMCID: PMC3561939  PMID: 23400232
genomics; RNY; EST; ORF; CDS; UFM; classification
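Two of the five factors listed in this abstract, stop codon frequency and purine occurrence at each triplet position, can be tabulated per reading frame as in the sketch below. This only computes the raw frequencies; it is not the UFM score itself, whose exact combination of factors is not given in the abstract. The function names and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
PURINES = {"A", "G"}

def codons(seq, frame):
    """Yield the complete codons of seq starting at offset frame (0, 1, or 2)."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        yield seq[i:i + 3]

def frame_stats(seq, frame):
    """Return (stop codon frequency, purine fraction at codon positions 1 to 3)."""
    triplets = list(codons(seq, frame))
    if not triplets:
        return 0.0, (0.0, 0.0, 0.0)
    stop_freq = sum(c in STOP_CODONS for c in triplets) / len(triplets)
    purine = tuple(
        sum(c[pos] in PURINES for c in triplets) / len(triplets)
        for pos in range(3)
    )
    return stop_freq, purine

toy = "GCAGAATAAGGT"  # hypothetical sequence, for illustration only
for frame in range(3):
    print(frame, frame_stats(toy, frame))
```

Comparing these statistics across the three forward frames (and, when the strand is unknown, the reverse complement as well) is the kind of training-free evidence the abstract describes.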
11.  PREPACT 2.0: Predicting C-to-U and U-to-C RNA Editing in Organelle Genome Sequences with Multiple References and Curated RNA Editing Annotation 
RNA editing is extensive in some genetic systems, with up to thousands of targeted C-to-U and U-to-C substitutions in mitochondria and chloroplasts of certain plants. Efficient prediction of RNA editing in organelle genomes will help to reveal overlooked cases of editing. We present PREPACT 2.0 (http://www.prepact.de) with numerous enhancements of our previously developed Plant RNA Editing Prediction & Analysis Computer Tool. Reference organelle transcriptomes for editing prediction have been extended and reorganized to include 19 curated mitochondrial and 13 chloroplast genomes, which now allows RNA editing sites to be distinguished from “pre-edited” sites. Queries may be run against multiple references, and a new “commons” function identifies and highlights orthologous candidate editing sites congruently predicted by multiple references. Enhancements to the BLASTX mode in PREPACT 2.0 allow querying of complete novel organelle genomes within a few minutes, identifying protein genes and candidate RNA editing sites simultaneously without prior user analyses.
doi:10.4137/BBI.S11059
PMCID: PMC3547502  PMID: 23362369
pyrimidine substitutions; RNA editing prediction; plants; protists; mitochondrial DNA; chloroplast DNA; BLASTX
12.  Identification and Insilico Analysis of Retinoblastoma Serum microRNA Profile and Gene Targets Towards Prediction of Novel Serum Biomarkers 
Retinoblastoma (RB) is a malignant tumor of the retina seen in children, and noninvasive biomarkers are needed for rapid diagnosis and for prognosticating the therapy. This study was undertaken to identify the differentially expressed miRNAs in the serum of children with RB in comparison with normal age-matched serum, to analyze their concurrence with the existing RB tumor miRNA profile, to identify novel gene targets specific to RB, and to study the expression of a few of the identified oncogenic miRNAs in serum samples from patients with advanced-stage primary RB. MiRNA profiling was performed on 14 pooled serum samples from children with advanced RB and 14 normal age-matched serum samples, wherein 21 miRNAs were found to be upregulated (fold change ≥ +2.0, P ≤ 0.05) and 24 to be downregulated (fold change ≤ −2.0, P ≤ 0.05). Furthermore, intersecting the 59 significantly deregulated miRNAs identified from RB tumor profiles with the miRNAs detected in the serum profile revealed that 33 miRNAs followed a similar deregulation pattern in RB serum. We then validated a few of the miRNAs (miRNA 17-92) identified by microarray in RB patient serum samples (n = 20) using qRT-PCR. Expression of the oncogenic miRNAs miR-17, miR-18a, and miR-20a, measured by qRT-PCR, was significant in the serum samples, pointing to the potential of serum miRNA identification for noninvasive diagnosis. Moreover, from miRNA gene target prediction, key regulatory genes of cell proliferation, apoptosis, and positive and negative regulatory networks involved in RB progression were identified in the gene expression profile of RB tumors. Therefore, these identified miRNAs and their corresponding target genes could give insights into potential biomarkers and key events involved in the RB pathway.
doi:10.4137/BBI.S10501
PMCID: PMC3547501  PMID: 23400111
retinoblastoma; micro RNA; biomarkers; bioinformatics tools
13.  Methods of Combinatorial Optimization to Reveal Factors Affecting Gene Length 
In this paper we present a novel method for ranking genomes according to gene lengths. The main outcomes described in this paper are the following: the formulation of the genome ranking problem, a presentation of relevant approaches to solve it, and the demonstration of preliminary results from ordering prokaryotic genomes. Using a subset of prokaryotic genomes, we attempted to uncover factors affecting gene length. We have demonstrated that hyperthermophilic species have shorter genes than mesophilic organisms, which probably means that environmental factors affect gene length. Moreover, these preliminary results show that environmental factors group evolutionarily distant species together in the ranking.
doi:10.4137/BBI.S10525
PMCID: PMC3528112  PMID: 23300345
adaptation; evolution of prokaryotes; orthologs; machine learning; dimension-reduction techniques; factor analysis; clustering; rating; ranking
14.  Analyzing Thiol-Dependent Redox Networks in the Presence of Methylene Blue and Other Antimalarial Agents with RT-PCR-Supported in silico Modeling 
Background
In the face of growing resistance in malaria parasites to drugs, pharmacological combination therapies are important. There is accumulating evidence that methylene blue (MB) is an effective drug against malaria. Here we explore the biological effects of both MB alone and in combination therapy using modeling and experimental data.
Results
We built a model of the central metabolic pathways in P. falciparum. Metabolic flux modes and their changes under MB were calculated by integrating experimental data (RT-PCR data on mRNAs for redox enzymes) as constraints and results from the YANA software package for metabolic pathway calculations. Several different lines of MB attack on Plasmodium redox defense were identified by analysis of the network effects. Next, chloroquine resistance based on pfmdr and pfcrt transporters, as well as pyrimethamine/sulfadoxine resistance (by mutations in DHF/DHPS), were modeled in silico. Further modeling shows that MB has a favorable synergism on antimalarial network effects with these commonly used antimalarial drugs.
Conclusions
Theoretical and experimental results support that methylene blue should, because of its resistance-breaking potential, be further tested as a key component in drug combination therapy efforts in holoendemic areas.
doi:10.4137/BBI.S10193
PMCID: PMC3516044  PMID: 23236254
methylene blue; resistance; drug; elementary mode analysis; malaria; combination therapy; pathway; metabolic flux
15.  Functional Annotation Analytics of Bacillus Genomes Reveals Stress Responsive Acetate Utilization and Sulfate Uptake in the Biotechnologically Relevant Bacillus megaterium 
Bacillus species form a heterogeneous group of Gram-positive bacteria that includes members that are disease-causing, are biotechnologically relevant, or serve as biological research tools. A common feature of Bacillus species is their ability to survive in harsh environmental conditions by formation of resistant endospores. Genes encoding the universal stress protein (USP) domain confer cellular and organismal survival during unfavorable conditions such as nutrient depletion. As of February 2012, the genome sequences and a variety of functional annotations for at least 123 Bacillus isolates, including 45 Bacillus cereus isolates, were available in public domain bioinformatics resources. Additionally, the genome sequencing status of 10 of the B. cereus isolates was annotated as finished, with each genome encoding 3 USP genes. The conservation of gene neighborhood of the 140 aa universal stress protein in the B. cereus genomes led to the identification of a predicted plasmid-encoded transcriptional unit that includes a USP gene and a sulfate uptake gene in the soil-inhabiting Bacillus megaterium. Gene neighborhood analysis combined with visual analytics of chemical ligand binding sites data provided knowledge-building biological insights on possible cellular functions of B. megaterium universal stress proteins. These functions include sulfate and potassium uptake, acid extrusion, cellular energy-level sensing, survival in high oxygen conditions and acetate utilization. Of particular interest was a two-gene transcriptional unit that consisted of genes for a universal stress protein and a sirtuin Sir2 (deacetylase enzyme for NAD+-dependent acetate utilization). The predicted transcriptional units for stress responsive inorganic sulfate uptake and acetate utilization could explain biological mechanisms for survival of soil-inhabiting Bacillus species in sulfate and acetate limiting conditions.
Considering the key role of sirtuins in mammalian physiology additional research on the USP-Sir2 transcriptional unit of B. megaterium could help explain mammalian acetate metabolism in glucose-limiting conditions such as caloric restriction. Finally, the deep-rooted position of B. megaterium in the phylogeny of Bacillus species makes the investigation of the functional coupling acetate utilization and stress response compelling.
doi:10.4137/BBI.S7977
PMCID: PMC3511254  PMID: 23226010
ATP-binding; acetate utilization; Bacillus; Bacillus cereus; Bacillus megaterium; Sir2; sirtuins; sulfate uptake; universal stress proteins
16.  Homology Modeling and Molecular Dynamics Simulation Studies of a Marine Alkaline Protease 
A cold-adapted marine alkaline protease (MP, accession no. ACY25898) was produced by a marine bacterium strain isolated from Yellow Sea sediment in China. Many previous studies showed that this protease had potential application as a detergent additive. It was therefore crucial to determine the tertiary structure of MP. In this study, a homology model of MP was constructed using the multiple templates alignment method. The tools PROCHECK, ERRAT, and Verify_3D were used to check the quality of the model. The results showed that 94% of residues were found in the most favored regions, 6% were in the additional allowed region, and 96.50% of the residues had average 3D-1D scores of no less than 0.2. Meanwhile, the overall quality factor (ERRAT) of our model was 80.657. In this study, we also focused on elucidating the molecular mechanism of the two “flap” motions. Based on the optimized model, molecular dynamics simulations of the entire protein in explicit solvent were carried out using the AMBER11 package in order to characterize the dynamical behavior of the two flaps. Our results showed an opening motion of the two flaps in water. This research may facilitate virtual screening of inhibitors for MP and may also lay a foundation for understanding the mechanism of the inhibitors.
doi:10.4137/BBI.S10663
PMCID: PMC3511052  PMID: 23226008
marine alkaline protease; homology modeling; molecular dynamics simulation; zinc-metalloprotease; explicit water
17.  A Fast Quad-Tree Based Two Dimensional Hierarchical Clustering 
Recently, microarray technologies have become a robust technique in the area of genomics. An important step in the analysis of gene expression data is the identification of groups of genes disclosing analogous expression patterns. Cluster analysis partitions a given dataset into groups based on specified features. Euclidean distance is a widely used similarity measure for gene expression data that considers the amount of change in gene expression. However, the huge number of genes and the intricacy of biological networks have greatly increased the challenges of comprehending and interpreting the resulting groups of data, increasing processing time. The proposed technique uses a quad-tree (QT) based fast 2-dimensional hierarchical clustering algorithm to perform clustering. The construction of the closest-pair data structure at each level is an important time factor that determines the processing time of clustering. The proposed model reduces the processing time and improves analysis of gene expression data.
doi:10.4137/BBI.S10383
PMCID: PMC3511054  PMID: 23226009
clustering; Euclidean distance; quad tree; hierarchical clustering
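For orientation, the closest-pair step that this abstract identifies as the main time factor looks like this in a naive agglomerative clustering with Euclidean distance. This is a brute-force baseline written for illustration, not the quad-tree algorithm the paper proposes; the function names, the centroid-based merge rule, and the toy points are assumptions.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def agglomerate(points, k):
    """Merge the closest pair of clusters (by centroid distance) until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Brute-force closest-pair search: the step a spatial index would speed up.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: euclidean(centroid(clusters[ij[0]]),
                                     centroid(clusters[ij[1]])),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters

# Toy 2-dimensional data (hypothetical).
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.5)]
result = agglomerate(points, 3)
print(result)
```

Each merge here scans every pair of clusters, so the whole run is roughly cubic in the number of points. A spatial structure such as a quad tree is one way to avoid rescanning all pairs at every level.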
18.  Myosinome: A Database of Myosins from Select Eukaryotic Genomes to Facilitate Analysis of Sequence-Structure-Function Relationships 
Myosins are one of the largest protein superfamilies, with 24 classes. They have conserved structural features and catalytic domains yet show huge variation in other domains, resulting in a variety of functions. Myosins are molecules driving various kinds of cellular processes and motility up to the level of organisms. They are ATPases that utilize the chemical energy released by ATP hydrolysis to bring about conformational changes leading to a motor function. Myosins are important as they are involved in almost all cellular activities, ranging from cell division to transcriptional regulation. They are crucial due to their involvement in many congenital diseases characterized by muscular malfunctions, cardiac diseases, deafness, and neural and immunological dysfunction, many of which lead to death at an early age. We present Myosinome, a database of selected myosin classes (myosin II, V, and VI) from five model organisms. This knowledge base provides the sequences, phylogenetic clustering, and domain architectures of myosins, together with molecular models, structural analyses, and relevant literature for their coiled-coil domains. The current version of Myosinome presents information about 71 myosin sequences belonging to three myosin classes (myosin II, V, and VI) in five model organisms (Homo sapiens, Mus musculus, D. melanogaster, C. elegans and S. cerevisiae) identified using bioinformatics surveys; several of them are yet to be functionally characterized. As these proteins are involved in congenital diseases, such a database would be useful in short-listing candidates for gene therapy and drug development. The database can be accessed from http://caps.ncbs.res.in/myosinome.
doi:10.4137/BBI.S9902
PMCID: PMC3503467  PMID: 23189029
myosin; Myosinome; myosin II; myosin V; myosin VI; myosin database
19.  BioNetwork Bench: Database and Software for Storage, Query, and Analysis of Gene and Protein Networks 
Gene and protein networks offer a powerful approach for integration of the disparate yet complementary types of data that result from high-throughput analyses. Although many tools and databases are currently available for accessing such data, they are left unutilized by bench scientists as they generally lack features for effective analysis and integration of both public and private datasets and do not offer an intuitive interface for use by scientists with limited computational expertise. We describe BioNetwork Bench, an open source, user-friendly suite of database and software tools for constructing, querying, and analyzing gene and protein network models. It enables biologists to analyze public as well as private gene expression data; interactively query gene expression datasets; integrate data from multiple networks; and store and selectively share the data and results. Finally, we describe an application of BioNetwork Bench to the assembly and iterative expansion of a gene network that controls the differentiation of retinal progenitor cells into rod photoreceptors. The tool is available from http://bionetworkbench.sourceforge.net/
Background
The emergence of high-throughput technologies has allowed many biological investigators to collect a great deal of information about the behavior of genes and gene products over time or during a particular disease state. Gene and protein networks offer a powerful approach for integration of the disparate yet complementary types of data that result from such high-throughput analyses. There are a growing number of public databases, as well as tools for visualization and analysis of networks. However, such databases and tools have yet to be widely utilized by bench scientists, as they generally lack features for effective analysis and integration of both public and private datasets and do not offer an intuitive interface for use by biological scientists with limited computational expertise.
Results
We describe BioNetwork Bench, an open source, user-friendly suite of database and software tools for constructing, querying, and analyzing gene and protein network models. BioNetwork Bench currently supports a broad class of gene and protein network models (eg, weighted and un-weighted, undirected graphs, multi-graphs). It enables biologists to analyze public as well as private gene expression, macromolecular interaction and annotation data; interactively query gene expression datasets; integrate data from multiple networks; query multiple networks for interactions of interest; store and selectively share the data as well as results of analyses. BioNetwork Bench is implemented as a plug-in for, and hence is fully interoperable with, Cytoscape, a popular open-source software suite for visualizing macromolecular interaction networks. Finally, we describe an application of BioNetwork Bench to the problem of assembly and iterative expansion of a gene network that controls the differentiation of retinal progenitor cells into rod photoreceptors.
Conclusions
BioNetwork Bench provides a suite of open source software for construction, querying, and selective sharing of gene and protein networks. Although initially aimed at a community of biologists interested in retinal development, the tool can be adapted easily to work with other biological systems simply by populating the associated database with the relevant datasets.
doi:10.4137/BBI.S9728
PMCID: PMC3498971
network analysis; software; network construction; network integration
20.  The Role of Electrostatic Interactions on Klentaq1 Insight for Domain Separation 
We investigated the relationship between the thermostability of Klentaq1 and the factors that stabilize its interdomain interactions. When the thermal adaptation of Klentaq1 was analyzed at the atomic level, the protein was stable at 300 and 350 K, unfolded gradually at 373 K, and unfolded almost spontaneously at 400 K. Domain separation was induced by disrupting the electrostatic interactions in two salt bridges, Lys354-Glu445 and Asp371-Arg435, at the domain interface. The role of these interactions in protein stability was evaluated by comparing the solvation free energy (ΔΔGsolv) of the wild type and mutants. Substitution of Asp371 by Glu or Asn, and of Glu445 by Asn, resulted in a positive ΔΔGsolv, suggesting that these mutations destabilize the protein structure. In contrast, substitution of Glu445 by Asp gave a negative ΔΔGsolv, reflecting increased protein stability. Our results demonstrate that interactions at the domain interface of Klentaq1 are essential factors correlated with its thermostability.
doi:10.4137/BBI.S9390
PMCID: PMC3491847  PMID: 23136465
Klentaq1; domain separation; electrostatic interaction; in silico mutation
21.  Integrative Structural Modelling of the Cardiac Thin Filament: Energetics at the Interface and Conservation Patterns Reveal a Spotlight on Period 2 of Tropomyosin 
Cardiomyopathies are a major health problem, with inherited cardiomyopathies, many of which are caused by mutations in genes encoding sarcomeric proteins, constituting an ever-increasing fraction of cases. To begin to study the mechanisms by which these mutations cause disease, we have employed an integrative modelling approach to study the interactions between tropomyosin and actin. Starting from the existing blocked state model, we identified a specific zone on the actin surface which is highly favourable to support tropomyosin sliding from the blocked/closed states to the open state. We then analysed the predicted actin-tropomyosin interface regions for the three states. Each quasi-repeat of tropomyosin was studied for its interaction strength and evolutionary conservation to focus on smaller surface zones. Finally, we show that the distribution of the known cardiomyopathy mutations of α-tropomyosin is consistent with our model. This analysis provides structural insights into the possible mode of interactions between tropomyosin and actin in the open state for the first time.
Video abstract available from http://la-press.com/t.php?i=9798
doi:10.4137/BBI.S9798
PMCID: PMC3468436  PMID: 23071391
cardiomyopathy mutations; cardiac thin filament; modeling; tropomyosin; period 2
22.  The Core Mouse Response to Infection by Neospora Caninum Defined by Gene Set Enrichment Analyses 
In this study, the responses of BALB/c and Qs mice to infection by the parasite Neospora caninum were investigated in order to identify host response mechanisms, using gene set (enrichment) analyses of microarray data. GSEA, MANOVA, Romer, subGSE and SAM-GS were used to study three contrasts: Neospora strain type, mouse type (BALB/c and Qs), and time post infection (6 hours and 10 days post infection). The analyses show that the major signal in the core mouse response to infection comes from time post infection and can be defined by the gene ontology terms Protein Kinase Activity, Cell Proliferation and Transcription Initiation. Several terms linked to signaling, morphogenesis, response and fat metabolism were also identified. At 10 days post infection, genes associated with fatty acid metabolism were identified as upregulated. The value of gene set (enrichment) analyses in the analysis of microarray data is discussed.
doi:10.4137/BBI.S9954
PMCID: PMC3448498  PMID: 23012496
mouse model; microarray; gene set; host response; immunity; neospora
23.  Simultaneous Analysis of Common and Rare Variants in Complex Traits: Application to SNPs (SCARVAsnp) 
Advances in technology and reduced costs are facilitating large-scale sequencing of genes and exomes as well as entire genomes. Recently, we described a haplotype-based approach called SCARVA [1] that enables the simultaneous analysis of the association between rare and common variants in disease etiology. Here, we describe an extension of SCARVA that evaluates individual markers instead of haplotypes. This modified method (SCARVAsnp) is implemented in four stages. First, all common variants in a pre-specified region (e.g., a gene) are evaluated individually. Second, a union procedure is used to combine all rare variants (RVs) in the index region, and the ratio of the log likelihood with one RV excluded to the log likelihood of a model with all the collapsed RVs is calculated. On the basis of previously reported simulation studies [1], a likelihood ratio ≥1.3 is considered statistically significant. Third, the direction of the association of the removed RV is determined by evaluating the change in λ values with the inclusion and exclusion of that RV. Lastly, significant common and rare variants, along with covariates, are included in a final regression model to evaluate the association between the trait and variants in that region. We use simulated and real data sets to show that the method is simple to use and computationally efficient, and that it can accurately identify both common and rare risk variants. This method overcomes several limitations of existing methods. For example, SCARVAsnp limits loss of statistical power by excluding from the final model variants that are not associated with the trait of interest. SCARVAsnp also takes the direction of association into consideration by effectively modelling positively and negatively associated variants.
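The second stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a binary trait, no covariates, and a logistic model whose only predictor is the collapsed "carries at least one RV" indicator, in which case the maximised log likelihood has a closed form (the fitted probabilities equal the observed case proportions among carriers and non-carriers). The function names and the data layout (one genotype row per individual, one column per variant) are hypothetical.

```python
import math


def _group_loglik(cases, total):
    """Maximised Bernoulli log likelihood for one group; 0*log(0) counts as 0."""
    ll = 0.0
    for k in (cases, total - cases):
        if k > 0:
            ll += k * math.log(k / total)
    return ll


def collapsed_loglik(genotypes, trait, rv_columns):
    """Log likelihood of: trait ~ carrier of any rare variant in rv_columns.

    Assumption: a single binary predictor and no covariates, so the
    maximum-likelihood fit is given by the per-group case proportions.
    """
    counts = {0: [0, 0], 1: [0, 0]}  # carrier status -> [cases, total]
    for row, y in zip(genotypes, trait):
        carrier = int(any(row[j] > 0 for j in rv_columns))
        counts[carrier][0] += y
        counts[carrier][1] += 1
    return sum(_group_loglik(c, n) for c, n in counts.values())


def leave_one_out_ratios(genotypes, trait, rv_columns):
    """Ratio LL(one RV excluded) / LL(all RVs collapsed), per rare variant."""
    full = collapsed_loglik(genotypes, trait, rv_columns)
    if full == 0.0:
        raise ValueError("full model fits perfectly; ratio is undefined")
    ratios = {}
    for j in rv_columns:
        rest = [k for k in rv_columns if k != j]
        ratios[j] = collapsed_loglik(genotypes, trait, rest) / full
    return ratios
```

Because both log likelihoods are negative, a ratio above 1 means that the fit worsens when the variant is left out of the collapsed set; under the abstract's rule, variants with a ratio ≥1.3 would be carried forward. Determining the direction of association and fitting the final regression model are not covered by this sketch.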
doi:10.4137/BBI.S9966
PMCID: PMC3418150  PMID: 22904618
complex traits; rare and common variants
24.  An Integrated Statistical Approach to Compare Transcriptomics Data Across Experiments: A Case Study on the Identification of Candidate Target Genes of the Transcription Factor PPARα 
An effective strategy to elucidate the signal transduction cascades activated by a transcription factor is to compare the transcriptional profiles of wild type and transcription factor knockout models. Many statistical tests have been proposed for analyzing gene expression data, but most are based on pair-wise comparisons. Since the analysis of microarrays involves the testing of multiple hypotheses within one study, it is generally accepted that one should control for false positives using the false discovery rate (FDR). However, it has been reported that this may be an inappropriate metric for comparing data across different experiments. Here we propose an approach that addresses this problem by the simultaneous testing and integration of three hypotheses (contrasts) using the cell means ANOVA model. These three contrasts test for the effect of a treatment in wild type, in gene knockout, and globally over all experimental groups. We illustrate our approach on microarray experiments that focused on the identification of candidate target genes and biological processes governed by the fatty acid sensing transcription factor PPARα in liver. Compared to the often applied FDR-based across-experiment comparison, our approach identified a conservative but less noisy set of candidate genes with the same sensitivity and specificity. In addition, our method had the advantage of properly adjusting for multiple testing while integrating data from two experiments, and was driven by biological inference. Taken together, in this study we present a simple yet efficient strategy to compare differential expression of genes across experiments while controlling for multiple hypothesis testing.
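As a rough illustration of the cell means ANOVA model mentioned above, the following sketch computes contrast estimates and t statistics for a single gene measured in four groups. It is a generic textbook calculation, not the authors' pipeline: the group names, the coefficient choices, and in particular the reading of the "global" contrast as the treatment effect averaged over both genotypes are assumptions, and the integration of the three tests and the multiple-testing adjustment are left out.

```python
import math


def cell_means_contrasts(groups, contrasts):
    """Estimate linear contrasts of group means under a cell means model.

    groups: dict mapping a group name to a list of expression values.
    contrasts: dict mapping a label to {group name: coefficient}.
    Returns {label: (estimate, t statistic, residual degrees of freedom)}.
    """
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    df = sum(len(v) for v in groups.values()) - len(groups)
    sse = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)
    mse = sse / df  # pooled residual variance
    if mse == 0.0:
        raise ValueError("zero residual variance; t statistic is undefined")
    results = {}
    for label, coef in contrasts.items():
        estimate = sum(c * means[g] for g, c in coef.items())
        se = math.sqrt(mse * sum(c * c / len(groups[g]) for g, c in coef.items()))
        results[label] = (estimate, estimate / se, df)
    return results


# Hypothetical group names and coefficients (assumptions, see the lead-in).
CONTRASTS = {
    "treatment_in_wild_type": {"wt_control": -1, "wt_treated": 1},
    "treatment_in_knockout": {"ko_control": -1, "ko_treated": 1},
    "treatment_overall": {
        "wt_control": -0.5, "wt_treated": 0.5,
        "ko_control": -0.5, "ko_treated": 0.5,
    },
}
```

Using one pooled residual variance for all three contrasts is what distinguishes this from running separate pair-wise tests in each experiment; each t statistic would then be compared with a t distribution on the returned degrees of freedom.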
doi:10.4137/BBI.S9529
PMCID: PMC3388742  PMID: 22783064
false discovery rate; gene expression; transcriptomics; microarray; transcription factor; peroxisome proliferator-activated receptor alpha; ANOVA
25.  The Influence of Taxon Sampling and Tree Shape on Molecular Dating: An Empirical Example from Mammalian Mitochondrial Genomes 
Over the last decade, molecular dating methods have been among the most studied subjects in statistical phylogenetics. Although the evolutionary modelling of substitution rates and the handling of calibration information are the primary focus of species divergence time research, parameters that influence topological estimation, such as taxon sampling and tree shape, also have the potential to influence evolutionary age estimates. However, the impact of topological parameters on chronological estimates is rarely considered. In this study, we use mitochondrial genomes to evaluate the influence of tree shape and taxon sampling on the divergence times of selected nodes of the mammalian tree. Our results show that taxon sampling affects divergence time estimates; the credibility intervals for age estimates decrease as taxonomic sampling increases (i.e., estimates become more precise). The influence of taxonomic sampling was not observed on nodes that lay deep in the mammalian phylogeny, although the means of the posterior distributions tend to converge with increased taxon sampling, an effect that is independent of the location of the node. In the majority of cases, the effect of tree shape was negligible.
doi:10.4137/BBI.S9677
PMCID: PMC3370833  PMID: 22693422
mammal time scale; divergence time; relaxed molecular clock
