1.  Genetic and chemical knockdown: a complementary strategy for evaluating an anti-infective target 
The equity of a drug target is principally evaluated by its genetic vulnerability with tools ranging from antisense- and microRNA-driven knockdowns to induced expression of the target protein. In order to upgrade the process of antibacterial target identification and discern its most effective type of inhibition, an in silico toolbox that evaluates its genetic and chemical vulnerability leading either to stasis or cidal outcome was constructed and validated. By precise simulation and careful experimentation using enolpyruvyl shikimate-3-phosphate synthase and its specific inhibitor glyphosate, it was shown that genetic knockdown is distinct from chemical knockdown. It was also observed that depending on the particular mechanism of inhibition, viz competitive, uncompetitive, and noncompetitive, the antimicrobial potency of an inhibitor could be orders of magnitude different. Susceptibility of Escherichia coli to glyphosate and the lack of it in Mycobacterium tuberculosis could be predicted by the in silico platform. Finally, as predicted and simulated in the in silico platform, the translation of growth inhibition to a cidal effect was able to be demonstrated experimentally by altering the carbon source from sorbitol to glucose.
doi:10.2147/AABC.S39198
PMCID: PMC3572760  PMID: 23413046
knockdown; inhibition; in silico; vulnerability
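The three inhibition modes named in the abstract above can be compared using the textbook Michaelis-Menten rate laws. This is a minimal sketch, not the authors' in silico platform; the function name `rate` and the default values of `vmax`, `km`, and `ki` are illustrative assumptions.

```python
def rate(s, i, vmax=1.0, km=1.0, ki=1.0, mode="competitive"):
    """Michaelis-Menten rate at substrate s with a reversible inhibitor at i.

    All parameter defaults are illustrative, not taken from the paper.
    """
    a = 1.0 + i / ki  # inhibition factor (1 + [I]/Ki)
    if mode == "competitive":
        # Inhibitor competes with substrate: apparent Km rises.
        return vmax * s / (km * a + s)
    if mode == "uncompetitive":
        # Inhibitor binds only the enzyme-substrate complex.
        return vmax * s / (km + s * a)
    if mode == "noncompetitive":
        # Pure noncompetitive: apparent Vmax falls, Km unchanged.
        return vmax * s / ((km + s) * a)
    raise ValueError(f"unknown mode: {mode}")
```

With the substrate far above `km`, the competitive form loses most of its effect while the other two do not, which is one simple way the mechanism of inhibition can change the apparent potency of the same inhibitor.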
2.  A novel biclustering approach with iterative optimization to analyze gene expression data 
Objective
With the dramatic increase in microarray data, biclustering has become a promising tool for gene expression analysis. Biclustering has been shown to be superior to clustering in identifying multifunctional genes and searching for co-expressed genes under a few specific conditions; that is, a subgroup of all conditions. Biclustering based on a genetic algorithm (GA) has shown better performance than greedy algorithms, but the overlap state for biclusters must be treated more systematically.
Results
We developed a new biclustering algorithm (binary-iterative genetic algorithm [BIGA]), based on an iterative GA, by introducing a novel, ternary-digit chromosome encoding function. BIGA searches for a set of biclusters by iterative binary divisions that allow the overlap state to be explicitly considered. In addition, the average Pearson’s correlation coefficient was employed to measure the relationship of genes within a bicluster, instead of the mean square residual, the popular classical index. Compared with six existing algorithms, BIGA found highly correlated biclusters, with large gene coverage and reasonable gene overlap. The gene ontology (GO) enrichment showed that most of the biclusters are significant, with at least one GO term overrepresented.
Conclusion
BIGA is a powerful tool to analyze large amounts of gene expression data, and will facilitate the elucidation of the underlying functional mechanisms in living organisms.
doi:10.2147/AABC.S32622
PMCID: PMC3459542  PMID: 23055751
biclustering; microarray data; genetic algorithm; Pearson’s correlation coefficient
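The quality index mentioned in the abstract above, the average Pearson's correlation coefficient among genes in a bicluster, can be sketched as follows. This is an illustration under assumptions, not BIGA itself: the function names are hypothetical, the signed correlation is averaged (the abstract does not say whether absolute values are used), and every selected gene row is assumed to vary across the selected conditions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)  # assumes neither row is constant


def mean_pairwise_correlation(matrix, genes, conditions):
    """Average correlation over all gene pairs, restricted to the bicluster."""
    rows = [[matrix[g][c] for c in conditions] for g in genes]
    pairs = list(combinations(rows, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

A bicluster is described here by a list of row indices (`genes`) and column indices (`conditions`) into the expression matrix, so the same matrix can be scored for many candidate biclusters.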
3.  Contact-based ligand-clustering approach for the identification of active compounds in virtual screening 
Evaluation of docking results is one of the most important problems for virtual screening and in silico drug design. Modern approaches for the identification of active compounds in a large data set of docked molecules use energy scoring functions. One of the general and most significant limitations of these methods relates to inaccurate binding energy estimation, which results in false scoring of docked compounds. Automatic analysis of poses using self-organizing maps (AuPosSOM) represents an alternative approach for the evaluation of docking results based on the clustering of compounds by the similarity of their contacts with the receptor. A scoring function was developed for the identification of the active compounds in the AuPosSOM clustered dataset. In addition, the efficiency of AuPosSOM in clustering compounds and in identifying key contacts considered important for their activity was improved. Benchmark tests for several targets revealed that, together with the developed scoring function, AuPosSOM represents a good alternative to the energy-based scoring functions for the evaluation of docking results.
doi:10.2147/AABC.S30881
PMCID: PMC3459543  PMID: 23055752
scoring; docking; virtual screening; CAR; AuPosSOM
4.  B-Pred, a structure based B-cell epitopes prediction server 
The ability to predict immunogenic regions in selected proteins by in-silico methods has broad implications, such as allowing a quick selection of potential reagents to be used as diagnostics, vaccines, immunotherapeutics, or research tools in several branches of biological and biotechnological research. However, the prediction of antibody target sites in proteins using computational methodologies has proven to be a highly challenging task, which is likely due to the somewhat elusive nature of B-cell epitopes. This paper proposes a web-based platform for scoring potential immunological reagents based on the structures or 3D models of the proteins of interest. The method scores a protein’s peptides set, which is derived from a sliding window, based on the average solvent exposure, with a filter on the average local model quality for each peptide. The platform was validated on a custom-assembled database of 1336 experimentally determined epitopes from 106 proteins for which a reliable 3D model could be obtained through standard modeling techniques. Despite showing poor sensitivity, this method can achieve a specificity of 0.70 and a positive predictive value of 0.29 by combining these two simple parameters. These values are slightly higher than those obtained with other established sequence-based or structure-based methods that have been evaluated using the same epitopes dataset. This method is implemented in a web server called B-Pred, which is accessible at http://immuno.bio.uniroma2.it/bpred. The server contains a number of original features that allow users to perform personalized reagent searches by manipulating the sliding window’s width and sliding step, changing the exposure and model quality thresholds, and running sequential queries with different parameters. The B-Pred server should assist experimentalists in the rational selection of epitope antigens for a wide range of applications.
doi:10.2147/AABC.S30620
PMCID: PMC3413014  PMID: 22888263
B-cell epitopes; immunoinformatics; bioinformatics; web server; epitope prediction
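The scoring idea described in the abstract above (a sliding window of peptides ranked by average solvent exposure, with a filter on average local model quality) can be sketched as below. This is a hedged illustration, not the B-Pred server code: the function name, the per-residue input lists, the default thresholds, and the convention that higher quality values are better are all assumptions.

```python
def score_windows(exposure, quality, width=3, step=1,
                  min_exposure=0.5, min_quality=0.5):
    """Rank sliding-window peptides by mean exposure.

    exposure, quality: per-residue values (hypothetical inputs, same length).
    Returns (start_index, mean_exposure) pairs, best first, keeping only
    windows that pass both thresholds.
    """
    hits = []
    for start in range(0, len(exposure) - width + 1, step):
        mean_exp = sum(exposure[start:start + width]) / width
        mean_qual = sum(quality[start:start + width]) / width
        if mean_exp >= min_exposure and mean_qual >= min_quality:
            hits.append((start, mean_exp))
    return sorted(hits, key=lambda h: h[1], reverse=True)
```

The `width`, `step`, and threshold arguments mirror the user-adjustable parameters listed in the abstract (window width, sliding step, exposure and model quality thresholds), so repeated calls with different values imitate running sequential queries.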
5.  A rapid method for combined analysis of common and rare variants at the level of a region, gene, or pathway 
Previously described methods for the combined analysis of common and rare variants have disadvantages such as requiring an arbitrary classification of variants or permutation testing to assess statistical significance. Here we propose a novel method which implements a weighting scheme based on allele frequencies observed in both cases and controls. Because the test is unbiased, scores can be analyzed with a standard t-test. To test its validity we applied it to data for common, rare, and very rare variants simulated under the null hypothesis. To test its power we applied it to simulated data in which association was present, including data using the observed allele frequencies of common and rare variants in NOD2 previously reported in cases of Crohn’s disease and controls. The method produced results that conformed well to those expected under the null hypothesis. It demonstrated more power to detect association when rare and common variants were analyzed jointly, the power further increasing when rare variants were assigned higher weights. 20,000 analyses of a gene containing 62 variants could be performed in 80 minutes on a laptop. This approach shows promise for the analysis of data currently emerging from genome wide sequencing studies.
doi:10.2147/AABC.S33049
PMCID: PMC3413013  PMID: 22888262
common; rare; variant; sequence; genome; exome
6.  Role of Shwachman-Bodian-Diamond syndrome protein in translation machinery and cell chemotaxis: a comparative genomics approach 
Shwachman-Bodian-Diamond syndrome (SBDS) is linked to a mutation in a single gene. The SBDS protein is involved in RNA metabolism and ribosome-associated functions, but SBDS mutation is primarily linked to a defect in polymorphonuclear leukocytes, which are unable to orient correctly in a spatial gradient of chemoattractants. Results of data mining and comparative genomic approaches undertaken in this study suggest that SBDS protein is also linked to tRNA metabolism and translation initiation. Analysis of crosstalk between translation machinery and cytoskeletal dynamics provides new insights into the cellular chemotactic defects caused by SBDS protein malfunction. The proposed functional interactions provide a new approach to exploit potential targets in the treatment and monitoring of this disease.
doi:10.2147/AABC.S23510
PMCID: PMC3202468  PMID: 22046100
Shwachman-Bodian-Diamond syndrome; wybutosine; tRNA; chemotaxis; translation; genomics; gene proximity
7.  FDR-FET: an optimizing gene set enrichment analysis method 
Gene set enrichment analysis for analyzing large profiling and screening experiments can reveal unifying biological schemes based on previously accumulated knowledge represented as “gene sets”. Most of the existing implementations use a fixed fold-change or P value cutoff to generate regulated gene lists. However, the threshold selection in most cases is arbitrary, and has a significant effect on the test outcome and interpretation of the experiment. We developed a new gene set enrichment analysis method, ie, FDR-FET, which dynamically optimizes the threshold choice and improves the sensitivity and selectivity of gene set enrichment analysis. The procedure translates experimental results into a series of regulated gene lists at multiple false discovery rate (FDR) cutoffs, and computes the P value of the overrepresentation of a gene set using a Fisher’s exact test (FET) in each of these gene lists. The lowest P value is retained to represent the significance of the gene set. We also implemented improved methods to define a more relevant global reference set for the FET. We demonstrate the validity of the method using a published microarray study of three protease inhibitors of the human immunodeficiency virus and compare the results with those from other popular gene set enrichment analysis algorithms. Our results show that combining FDR with multiple cutoffs allows us to control the error while retaining genes that increase information content. We conclude that FDR-FET can selectively identify significant affected biological processes. Our method can be used for any user-generated gene list in the area of transcriptome, proteome, and other biological and scientific applications.
doi:10.2147/AABC.S15840
PMCID: PMC3169954  PMID: 21918636
gene set enrichment analysis; false discovery rate; Fisher’s exact test; microarray profiling; protease inhibitors
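The procedure outlined in the abstract above (build regulated gene lists at several FDR cutoffs, run a Fisher's exact test for the gene set in each list, keep the lowest P value) can be sketched as follows. This is a simplified illustration, not the published FDR-FET implementation: the function names, the cutoff values, and the input dictionary of precomputed per-gene FDR values are assumptions, and the reference set is simply all measured genes rather than the improved reference set the authors describe.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher's exact P value: P(X >= k), where X counts set
    members among n regulated genes drawn from N genes, K of them in the set."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, x) * comb(N - K, n - x)
               for x in range(k, upper + 1)) / total


def fdr_fet(gene_fdr, gene_set, cutoffs=(0.01, 0.05, 0.1, 0.2)):
    """Return (lowest P value, cutoff that produced it) for one gene set.

    gene_fdr: dict mapping gene -> FDR value (hypothetical input).
    """
    universe = set(gene_fdr)
    members = universe & set(gene_set)
    best = (1.0, None)
    for cutoff in cutoffs:
        regulated = {g for g, q in gene_fdr.items() if q <= cutoff}
        if not regulated:
            continue
        p = fisher_right_tail(len(regulated & members), len(regulated),
                              len(members), len(universe))
        if p < best[0]:
            best = (p, cutoff)
    return best
```

Because the minimum is taken over several tests, the retained P value is optimistic on its own; how the published method accounts for this is not stated in the abstract and is not modeled here.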
8.  Affinity of estrogens for human progesterone receptor A and B monomers and risk of breast cancer: a comparative molecular modeling study 
Background
The human progesterone receptor (hPR) belongs to the steroid receptor family. It may be found as monomers (A and B) and/or as a dimer (AB). hPR is regarded as the prognostic biomarker for breast cancer. In a cellular dimer system, AB is the dominant species in most cases. However, when a cell coexpresses all three isoforms of hPR, the complexity of the action of this receptor increases. For example, hPR A suppresses the activity of hPR B, and the ratio of hPR A to hPR B may determine the physiology of a breast tumor. Also, persistent exposure of hPRs to nonendogenous ligands is a common risk factor for breast cancer. Hence we aimed to study progesterone and some nonendogenous ligand interactions with hPRs and their molecular docking.
Methods and results
A pool of steroid derivatives, namely, progesterone, cholesterol, testosterone, testolactone, estradiol, estrone, norethindrone, exemestane, and norgestrel, was used for this in silico study. Dockings were performed with AutoDock 4.2. We found that estrogens, including estradiol and estrone, had a higher affinity for hPR A and B monomers in comparison with the dimer, hPR AB, and that of the endogenous progesterone ligand. hPR A had a higher affinity for all the docked ligands than hPR B.
Conclusion
This study suggests that the exposure of estrogens to hPR A as well as hPR B, and more particularly to hPR A alone, is a risk factor for breast cancer.
doi:10.2147/AABC.S17371
PMCID: PMC3169952  PMID: 21918635
human progesterone receptor; breast cancer; steroid derivatives; estrogens; molecular docking
9.  LifePrint: a novel k-tuple distance method for construction of phylogenetic trees 
Purpose
Here we describe LifePrint, a sequence alignment-independent k-tuple distance method to estimate relatedness between complete genomes.
Methods
We designed a representative sample of all possible DNA tuples of length 9 (9-tuples). The final sample comprises 1878 tuples (called the LifePrint set of 9-tuples; LPS9) that are distinct from each other by at least two internal and noncontiguous nucleotide differences. For validation of our k-tuple distance method, we analyzed several real and simulated viroid genomes. Using different distance metrics, we scrutinized diverse viroid genomes to estimate the k-tuple distances between these genomic sequences. Then we used the estimated genomic k-tuple distances to construct phylogenetic trees using the neighbor-joining algorithm. A comparison of the accuracy of LPS9 and the previously reported 5-tuple method was made using symmetric differences between the trees estimated from each method and a simulated “true” phylogenetic tree.
Results
The identified optimal search scheme for LPS9 allows only up to two nucleotide differences between each 9-tuple and the scrutinized genome. Similarity search results of simulated viroid genomes indicate that, in most cases, LPS9 is able to detect single-base substitutions between genomes efficiently. Analysis of simulated genomic variants with a high proportion of base substitutions indicates that LPS9 is able to discern relationships between genomic variants with up to 40% of nucleotide substitution.
Conclusion
Our LPS9 method generates more accurate phylogenetic reconstructions than the previously proposed 5-tuples strategy. LPS9-reconstructed trees show higher bootstrap proportion values than distance trees derived from the 5-tuple method.
doi:10.2147/AABC.S15021
PMCID: PMC3169951  PMID: 21918634
phylogeny; sequence alignment; similarity search; tuple; viroid
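The general idea of an alignment-free k-tuple distance, as used in the abstract above, can be sketched with plain k-tuple frequency profiles. This is a generic illustration only: it counts every k-tuple with exact matching and uses Euclidean distance, whereas LifePrint uses its selected LPS9 set of 9-tuples with up to two mismatches, which is not reproduced here. The function names and the default `k` are assumptions, and each sequence is assumed to be at least `k` characters long.

```python
from collections import Counter
from math import sqrt


def ktuple_profile(seq, k):
    """Relative frequency of every overlapping k-tuple in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def ktuple_distance(a, b, k=3):
    """Euclidean distance between the k-tuple profiles of two sequences."""
    pa, pb = ktuple_profile(a, k), ktuple_profile(b, k)
    tuples = set(pa) | set(pb)
    return sqrt(sum((pa.get(t, 0.0) - pb.get(t, 0.0)) ** 2 for t in tuples))
```

A matrix of such pairwise distances is the kind of input a neighbor-joining implementation would take to build a tree.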
10.  MODENA: a multi-objective RNA inverse folding 
Artificially synthesized RNA molecules have recently come under study since such molecules have a potential for creating a variety of novel functional molecules. When designing artificial RNA sequences, secondary structure should be taken into account since functions of noncoding RNAs strongly depend on their structure. RNA inverse folding is a methodology for computationally exploring the RNA sequences folding into a user-given target structure. In the present study, we developed a multi-objective genetic algorithm, MODENA (Multi-Objective DEsign of Nucleic Acids), for RNA inverse folding. MODENA explores the approximate set of weak Pareto optimal solutions in the objective function space of 2 objective functions, a structure stability score and structure similarity score. MODENA can simultaneously design multiple different RNA sequences at 1 run, whose lowest free energies range from a very stable value to a higher value near those of natural counterparts. MODENA and previous RNA inverse folding programs were benchmarked with 29 target structures taken from the Rfam database, and we found that MODENA can successfully design 23 RNA sequences folding into the target structures; this result is better than those of the other benchmarked RNA inverse folding programs. The multi-objective genetic algorithm gives a useful framework for a functional biomolecular design. Executable files of MODENA can be obtained at http://rna.eit.hirosaki-u.ac.jp/modena/.
PMCID: PMC3169953  PMID: 21918633
multi-objective genetic algorithm; secondary structure; RNA sequence design; Rfam
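The two-objective selection described in the abstract above can be illustrated with a small non-dominated filter. This is a sketch under assumptions, not MODENA's genetic algorithm: both objectives are treated as scores to maximize, the tuples stand in for hypothetical (stability score, similarity score) pairs, and ordinary Pareto dominance is used, which keeps a subset of the weak Pareto optimal solutions the abstract refers to.

```python
def dominates(p, q):
    """True if p is at least as good as q everywhere and better somewhere."""
    return (all(a >= b for a, b in zip(p, q))
            and any(a > b for a, b in zip(p, q)))


def pareto_front(points):
    """Keep the points that no other point dominates (order preserved)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points)]
```

Returning a whole front instead of one best point is what lets a single run offer several candidate sequences that trade stability against similarity to the target structure.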
11.  SNP analysis of follistatin gene associated with polycystic ovarian syndrome 
Follistatin has been reported as a candidate gene for polycystic ovarian syndrome (PCOS) based on linkage and association studies. In this study, polymorphisms in the FST gene were investigated to determine whether genetic variation is associated with susceptibility to PCOS. The nucleotide and protein sequences of human follistatin were retrieved from the NCBI database using Entrez, and the human follistatin protein was retrieved from the Swiss-Prot database. The protein has 344 amino acids and a molecular weight of 38,007 Da. ProtParam analysis shows that the isoelectric point is 5.53, the aliphatic index is 61.25, and the hydropathicity is −0.490. The domains in the FST protein are as follows: Pfam-B 5005 domain from 1 to 92; EGF-like subdomain from 93 to 116; and Kazal 1 domain, occurring in three places, namely, 118–164, 192–239, and 270–316. There are 31 single-nucleotide polymorphisms (SNPs) for this gene. Some are nonsynonymous, some occur in the intron region, and some in an untranslated region. Two nonsynonymous SNPs, namely, rs11745088 and rs1127760, were taken for analysis. In SNP rs11745088, the change is E152Q; in rs1127760, the change is C239S. SIFT (Sorting Intolerant from Tolerant) showed, for each position, the single-letter codes of the amino acids that are tolerated or deleterious. There were six SNP results and each result had links to it. The dbSNP id, the primary database id, and the type of mutation (whether silent and whether occurring in a coding region) are given as phenotype alterations. The protein sequence in FASTA format was given to the nsSNP Analyzer tool, and the variations E152Q and C239S were given as inputs in the SNP data field. The E152Q change was classified as neutral and C239S as disease-causing. Using PANTHER for evolutionary analysis of coding SNPs, the protein sequence was given as input and analyzed for the E152Q and C239S SNPs for deleterious effect on protein function.
The genetic association database results showed that FST gene SNPs are linked to PCOS coming under the disease class of metabolic disorders. The list of intronic and synonymous SNPs, with their nucleotide position, amino acid change information, and dbSNP link, is provided for further analysis.
doi:10.2147/AABC.S11013
PMCID: PMC3170008  PMID: 21918632
FST; polycystic ovarian syndrome; single-nucleotide polymorphism analysis
12.  A kinetic platform for in silico modeling of the metabolic dynamics in Escherichia coli 
Background
A prerequisite for a successful design and discovery of an antibacterial drug is the identification of essential targets as well as potent inhibitors that adversely affect the survival of bacteria. In order to understand how intracellular perturbations occur due to inhibition of essential metabolic pathways, we have built, through the use of ordinary differential equations, a mathematical model of 8 major Escherichia coli pathways.
Results
Individual in vitro enzyme kinetic parameters published in the literature were used to build the network of pathways in such a way that the flux distribution matched that reported from whole cells. Gene regulation at the transcription level as well as feedback regulation of enzyme activity was incorporated as reported in the literature. The unknown kinetic parameters were estimated by trial and error through simulations by observing network stability. Metabolites, whose biosynthetic pathways were not represented in this platform, were provided at a fixed concentration. Unutilized products were maintained at a fixed concentration by removing excess quantities from the platform. This approach enabled us to achieve steady state levels of all the metabolites in the cell. The output of various simulations correlated well with those previously published.
Conclusion
Such a virtual platform can be exploited for target identification through assessment of their vulnerability, desirable mode of target enzyme inhibition, and metabolite profiling to ascribe mechanism of action following a specific target inhibition. Vulnerability of targets in the biosynthetic pathway of coenzyme A was evaluated using this platform. In addition, we also report the utility of this platform in understanding the impact of a physiologically relevant carbon source, glucose versus acetate, on metabolite profiles of bacterial pathogens.
doi:10.2147/AABC.S14368
PMCID: PMC3170011  PMID: 21918631
antibacterial drug; mathematical model; kinetic platform; metabolic dynamics; Escherichia coli
13.  Construction of random perfect phylogeny matrix 
Purpose
Interest in developing methods appropriate for mapping increasing amounts of genome-wide molecular data is increasing rapidly. There is also an increasing need for methods that are able to efficiently simulate such data.
Patients and methods
In this article, we provide a graph-theory approach to find the necessary and sufficient conditions for the existence of a phylogeny matrix with k nonidentical haplotypes, n single nucleotide polymorphisms (SNPs), and a population size of m for which the minimum allele frequency of each SNP is between two specific numbers a and b.
Results
We introduce an O(max(n^2, nm)) algorithm for the random construction of such a phylogeny matrix. The running time of any algorithm for solving this problem would be Ω(nm).
Conclusion
We have developed software, RAPPER, based on this algorithm, which is available at http://bioinf.cs.ipm.ir/softwares/RAPPER.
doi:10.2147/AABC.S13397
PMCID: PMC3170006  PMID: 21918630
perfect phylogeny; minimum allele frequency (MAF); tree; recursive algorithm
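Two of the constraints in the abstract above can be checked on a candidate binary haplotype-by-SNP matrix with a few lines of code. This is not the RAPPER construction algorithm; it is a sketch of the classic four-gamete condition (a binary matrix admits a perfect phylogeny with an unspecified root exactly when no pair of SNP columns shows all four combinations 00, 01, 10, 11) together with a per-SNP minor allele frequency. The function names and the row-per-individual layout are assumptions.

```python
from itertools import combinations


def four_gamete_ok(matrix):
    """True if no pair of columns contains all four gametes."""
    n_snps = len(matrix[0])
    for i, j in combinations(range(n_snps), 2):
        if len({(row[i], row[j]) for row in matrix}) == 4:
            return False
    return True


def minor_allele_frequencies(matrix):
    """Minor allele frequency of each SNP column (rows are individuals)."""
    m = len(matrix)
    freqs = []
    for j in range(len(matrix[0])):
        ones = sum(row[j] for row in matrix)
        freqs.append(min(ones, m - ones) / m)
    return freqs
```

This pairwise check costs O(n^2 m) for n SNPs and m rows, so it is only a validation aid for small examples and does not reflect the running time reported for the construction algorithm.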
14.  Efficient algorithms for multidimensional global optimization in genetic mapping of complex traits 
We present a two-phase strategy for optimizing a multidimensional, nonconvex function arising during genetic mapping of quantitative traits. Such traits are believed to be affected by multiple so-called quantitative trait loci (QTL), and searching for d QTL results in a d-dimensional optimization problem with a large number of local optima. We combine the global algorithm DIRECT with a number of local optimization methods that accelerate the final convergence, and adapt the algorithms to problem-specific features. We also improve the evaluation of the QTL mapping objective function to enable exploitation of the smoothness properties of the optimization landscape. Our best two-phase method is demonstrated to be accurate in at least six dimensions and up to ten times faster than currently used QTL mapping algorithms.
doi:10.2147/AABC.S9240
PMCID: PMC3170002  PMID: 21918629
global optimization; QTL mapping; DIRECT
15.  An unsupervised strategy for biomedical image segmentation 
Many segmentation techniques have been published, and some of them have been widely used in different application problems. Most of these segmentation techniques have been motivated by specific application purposes. Unsupervised methods, which do not assume that any prior scene knowledge can be learned to help the segmentation process, are obviously more challenging than supervised ones. In this paper, we present an unsupervised strategy for biomedical image segmentation using an algorithm based on recursively applying mean shift filtering, where entropy is used as a stopping criterion. This strategy is tested on many real images, and a comparison is carried out with manual segmentation. With the proposed strategy, errors of less than 20% for false positives and 0% for false negatives are obtained.
doi:10.2147/AABC.S11918
PMCID: PMC3170003  PMID: 21918628
segmentation; mean shift; unsupervised segmentation; entropy
16.  Modeling of thermodynamic and physico-chemical properties of coumarins bioactivity against Candida albicans using a Levenberg–Marquardt neural network 
In recent years, owing to the vital need for novel fungicidal agents, research into natural antifungal resources has increased. The special features exhibited by neural network classifiers make them suitable for handling complex problems such as analyzing different properties of candidate compounds in computer-aided drug design. In this study, a Levenberg–Marquardt (LM) neural network (the fastest of the training algorithms) was used to evaluate the relation between some important thermodynamic and physico-chemical properties of coumarin compounds and their biological activities (tested against Candida albicans). A set of previously reported antifungal bioactive coumarins and some well-known physical descriptors were selected, and the LM training algorithm was used to design the best neural model architecture for forecasting new bioactive compounds.
PMCID: PMC3170013  PMID: 21918627
Levenberg/Marquardt algorithm; coumarin; neural network
17.  Molecular biocoding of insulin 
This paper discusses cyberinformation studies of the amino acid composition of insulin, in particular the identification of scientific terminology that could describe this phenomenon, ie, the study of genetic information, as well as the relationship between the genetic language of proteins and theoretical aspects of this system and cybernetics. The results of this research show that there is a matrix code for insulin. They also show that the coding system within the amino acid language gives detailed information, not only on the amino acid “record”, but also on its structure, configuration, and various shapes. The issue of the existence of an insulin code and coding of the individual structural elements of this protein are discussed. Answers to the following questions are sought. Does the matrix mechanism for biosynthesis of this protein function within the law of the general theory of information systems, and what is the significance of this for understanding the genetic language of insulin? What is the essence of existence and functioning of this language? Is the genetic information characterized only by biochemical principles or is it also characterized by cyberinformation principles? The potential effects of physical and chemical, as well as cybernetic and information principles, on the biochemical basis of insulin are also investigated. This paper discusses new methods for developing genetic technologies, in particular more advanced digital technology based on programming, cybernetics, and informational laws and systems, and how this new technology could be useful in medicine, bioinformatics, genetics, biochemistry, and other natural sciences.
PMCID: PMC3170004  PMID: 21918626
human insulin; insulin model; biocode; genetic code; amino acids
18.  Pharmacogenomics of drug efficacy in the interferon treatment of chronic hepatitis C using classification algorithms 
Chronic hepatitis C (CHC) patients often stop pursuing interferon-alfa and ribavirin (IFN-alfa/RBV) treatment because of the high cost and associated adverse effects. It is highly desirable, both clinically and economically, to establish tools to distinguish responders from nonresponders and to predict possible outcomes of the IFN-alfa/RBV treatments. Single nucleotide polymorphisms (SNPs) can be used to understand the relationship between genetic inheritance and IFN-alfa/RBV therapeutic response. The aim in this study was to establish a predictive model based on a pharmacogenomic approach. Our study population comprised Taiwanese patients with CHC who were recruited from multiple sites in Taiwan. The genotyping data was generated in the high-throughput genomics lab of Vita Genomics, Inc. With the wrapper-based feature selection approach, we employed multilayer feedforward neural network (MFNN) and logistic regression as a basis for comparisons. Our data revealed that the MFNN models were superior to the logistic regression model. The MFNN approach provides an efficient way to develop a tool for distinguishing responders from nonresponders prior to treatments. Our preliminary results demonstrated that the MFNN algorithm is effective for deriving models for pharmacogenomics studies and for providing the link from clinical factors such as SNPs to the responsiveness of IFN-alfa/RBV in clinical association studies in pharmacogenomics.
PMCID: PMC3170005  PMID: 21918625
chronic hepatitis C; artificial neural networks; interferon; pharmacogenomics; ribavirin; single nucleotide polymorphisms
19.  Simultaneous use of solution, solid-state NMR and X-ray crystallography to study the conformational landscape of the Crh protein during oligomerization and crystallization 
We explore, using the Crh protein dimer as a model, how information from solution NMR, solid-state NMR and X-ray crystallography can be combined using structural bioinformatics methods, in order to get insights into the transition from solution to crystal. Using solid-state NMR chemical shifts, we filtered intra-monomer NMR distance restraints in order to keep only the restraints valid in the solid state. These filtered restraints were added to solid-state NMR restraints recorded on the dimer state to sample the conformational landscape explored during the oligomerization process. The use of non-crystallographic symmetries then permitted the extraction of converged conformers subsets. Ensembles of NMR and crystallographic conformers calculated independently display similar variability in monomer orientation, which supports a funnel shape for the conformational space explored during the solution-crystal transition. Insights into alternative conformations possibly sampled during oligomerization were obtained by analyzing the relative orientation of the two monomers, according to the restraint precision. Molecular dynamics simulations of Crh confirmed the tendencies observed in NMR conformers, as a paradoxical increase of the distance between the two β1a strands, when the structure gets closer to the crystallographic structure, and the role of water bridges in this context.
PMCID: PMC3170007  PMID: 21918624
structural bioinformatics; NMR structure calculation; ARIA; non-crystallographic symmetry; crystallographic ensemble refinement; molecular dynamics simulation
20.  Insights into the classification of small GTPases 
In this study we used a Random Forest-based approach for an assignment of small guanosine triphosphate proteins (GTPases) to specific subgroups. Small GTPases represent an important functional group of proteins that serve as molecular switches in a wide range of fundamental cellular processes, including intracellular transport, movement and signaling events. These proteins have further gained a special emphasis in cancer research, because within the last decades a huge variety of small GTPases from different subgroups could be related to the development of all types of tumors. Using a random forest approach, we were able to identify the most important amino acid positions for the classification process within the small GTPases superfamily and its subgroups. These positions are in line with the results of earlier studies and have been shown to be the essential elements for the different functionalities of the GTPase families. Furthermore, we provide an accurate and reliable software tool (GTPasePred) to identify potential novel GTPases and demonstrate its application to genome sequences.
PMCID: PMC3170009  PMID: 21918623
cancer; machine learning; classification; Random Forests; proteins
21.  Predicting recurrent aphthous ulceration using genetic algorithms-optimized neural networks 
Objective
To construct and optimize a neural network that is capable of predicting the occurrence of recurrent aphthous ulceration (RAU) based on a set of appropriate input data.
Participants and methods
Artificial neural network (ANN) software employing genetic algorithms to optimize the architecture of the neural networks was used. Input and output data of 86 participants (predisposing factors and status of the participants with regard to recurrent aphthous ulceration) were used to construct and train the neural networks. The optimized neural networks were then tested using untrained data of a further 10 participants.
Results
The optimized neural network that produced the most accurate predictions for the presence or absence of recurrent aphthous ulceration was found to employ gender, hematological (with or without ferritin) and mycological data of the participants, frequency of tooth brushing, and consumption of vegetables and fruits.
Conclusions
Factors appearing to be related to recurrent aphthous ulceration and appropriate for use as input data to construct ANNs that predict recurrent aphthous ulceration were found to include the following: gender, hemoglobin, serum vitamin B12, serum ferritin, red cell folate, salivary candidal colony count, frequency of tooth brushing, and the number of fruits or vegetables consumed daily.
PMCID: PMC3170012  PMID: 21918622
artificial neural networks; recurrent; aphthous ulceration; ulcer
22.  Estimating affinities of calcium ions to proteins 
Ca2+-ions have a range of affinities for different proteins, depending on the various functions of these proteins. This makes the determination of Ca2+-protein affinities an interesting subject for functional studies. We have investigated the performance of two methods – Fold-X and AutoDock Vina – in the prediction of Ca2+-protein affinities. Both methods, although based on different energy functions, showed virtually the same correlation with experimental affinities. Guided by insight from experiment, we further derived a simple linear model based on the solvent accessible surface of Ca2+ that had practically the same performance in terms of absolute errors as the more complex docking methods.
PMCID: PMC3170010  PMID: 21918621
metal ions; binding; free energy; crystal structure; solvent accessible surface
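The idea of a simple linear model with a single predictor, as mentioned in the abstract above, can be illustrated with an ordinary least-squares fit. This is only a minimal sketch: the function names `fit_line` and `mean_absolute_error` are hypothetical, and the numbers are made-up placeholders with no physical meaning, not data or coefficients from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(xs, ys, slope, intercept):
    """Average absolute difference between fitted and observed values."""
    errors = [abs(slope * x + intercept - y) for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Placeholder values only (assumption): a predictor such as an accessible
# surface area, and a response such as a binding free energy.
surface = [0.0, 5.0, 10.0, 15.0]
affinity = [-8.0, -7.0, -6.0, -5.0]

slope, intercept = fit_line(surface, affinity)
mae = mean_absolute_error(surface, affinity, slope, intercept)
print(slope, intercept, mae)
```

Comparing such a model with docking-based predictions would then come down to computing the same absolute-error measure for each method against experimental affinities.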
23.  Logical network of genotoxic stress-induced NF-κB signal transduction predicts putative target structures for therapeutic intervention strategies 
Genotoxic stress is induced by a broad range of DNA-damaging agents and can lead to a variety of human diseases including cancer. DNA damage is also therapeutically induced for cancer treatment with the aim of eliminating tumor cells. However, the effectiveness of radio- and chemotherapy is strongly hampered by tumor cell resistance. A major reason for radio- and chemotherapeutic resistance is the simultaneous activation of cell survival pathways resulting in the activation of the transcription factor nuclear factor-kappa B (NF-κB). Here, we present a Boolean network model of the NF-κB signal transduction induced by genotoxic stress in epithelial cells. For the representation and analysis of the model, we used the formalism of logical interaction hypergraphs. Model reconstruction was based on a careful meta-analysis of published data. By calculating minimal intervention sets, we identified p53-induced protein with a death domain (PIDD), receptor-interacting protein 1 (RIP1), and protein inhibitor of activated STAT y (PIASy) as putative therapeutic targets to abrogate NF-κB activation resulting in apoptosis. Targeting these structures therapeutically may potentiate the effectiveness of radio- and chemotherapy. Thus, the presented model allows a better understanding of the signal transduction in tumor cells and provides candidates for new therapeutic target structures.
PMCID: PMC3169943  PMID: 21918620
apoptosis; Boolean network; cancer therapy; DNA-damage response; NF-κB
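The notion of a minimal intervention set in a Boolean network, used in the abstract above, can be sketched on a toy example. Everything here is an assumption for illustration: the four-node network, the node names, and the brute-force search over knockouts are hypothetical and are not the published NF-κB model or the authors' algorithm.

```python
from itertools import combinations

# Hypothetical toy network (assumption): a stress input activates A and B,
# either of which activates C, which in turn switches on the output.
RULES = {
    "A": lambda s: s["stress"],
    "B": lambda s: s["stress"],
    "C": lambda s: s["A"] or s["B"],
    "out": lambda s: s["C"],
}


def steady_state(knockouts, stress=True, max_steps=20):
    """Synchronously update the network until it reaches a fixed point."""
    state = {node: False for node in RULES}
    state["stress"] = stress
    for _ in range(max_steps):
        new_state = dict(state)
        for node, rule in RULES.items():
            # A knocked-out node is forced to stay off.
            new_state[node] = False if node in knockouts else bool(rule(state))
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no fixed point reached")


def minimal_intervention_sets(target="out", candidates=("A", "B", "C")):
    """Smallest knockout sets that switch the target off under stress."""
    found = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            # Skip supersets of a solution already found: they are not minimal.
            if any(set(prev) <= set(combo) for prev in found):
                continue
            if not steady_state(set(combo))[target]:
                found.append(combo)
    return found


print(minimal_intervention_sets())
```

In this toy network the output can be blocked either by knocking out C alone or by knocking out A and B together, because A and B are redundant activators. Exhaustive enumeration like this only scales to small networks.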
24.  Computer applications for prediction of protein–protein interactions and rational drug design 
In recent years, protein–protein interactions have become the object of increasing attention in many different fields, such as structural biology, molecular biology, systems biology, and drug discovery. From a structural biology perspective, it would be desirable to integrate current efforts into the structural proteomics programs. Given that experimental determination of many protein–protein complex structures is highly challenging, and in the context of current high-performance computational capabilities, different computer tools are being developed to help in this task. Among them, computational docking aims to predict the structure of a protein–protein complex starting from the atomic coordinates of its individual components, and in recent years, a growing number of docking approaches have been reported with increased predictive capabilities. The improvement of speed and accuracy of these docking methods, together with the modeling of the interaction networks that regulate the most critical processes in a living organism, will be essential for computational proteomics. The ultimate goal is the rational design of drugs capable of specifically inhibiting or modifying protein–protein interactions of therapeutic significance. While rational design of protein–protein interaction inhibitors is at a very early stage, the first results are promising.
PMCID: PMC3169948  PMID: 21918619
protein-protein interactions; drug design; protein docking; structural prediction; virtual ligand screening; hot-spots
25.  Classification of heterodimer interfaces using docking models and construction of scoring functions for the complex structure prediction 
Protein–protein docking simulations can provide predicted structural models of complexes. In a docking simulation, several putative structural models are selected by scoring functions from an ensemble of many complex models. Scoring functions based on statistical analyses of heterodimers are usually designed to select, as the correct model, the complex model with the most abundant interaction mode found among the known complexes. However, because the formation schemes of heterodimers are extremely diverse, a single scoring function does not seem to be sufficient to describe the fitness of predicted models that follow interaction modes other than the most abundant one. Thus, it is necessary to classify the heterodimers in terms of their individual interaction modes, and then to construct a separate scoring function for each heterodimer type. In this study, we constructed a classification method for heterodimers based on the discriminative features between near-native and decoy models, which were found by comparing the interfaces in terms of their complementarity in hydrophobicity, electrostatic potential and shape. Consequently, we found four heterodimer clusters, and then constructed multiple scoring functions, each of which was optimized for one cluster. Our multiple scoring functions were applied to predictions in unbound docking.
PMCID: PMC3169947  PMID: 21918618
classification of heterodimers; prediction of complex structures; scoring functions; protein-protein docking; CAPRI
