One approach to experimental science involves creating hypotheses and then testing them by varying one or more independent variables and assessing the effects of this variation on the processes of interest. We use this strategy to compare the intellectual status of, and available evidence for, two models or views of the mechanisms of transmembrane drug transport into intact biological cells. One (BDII) asserts that lipoidal phospholipid Bilayer Diffusion Is Important, while the second (PBIN) proposes that in normal intact cells Phospholipid Bilayer diffusion Is Negligible (i.e., may be neglected quantitatively) because evolution selected against it, and that transmembrane drug transport is instead effected by genetically encoded proteinaceous carriers or pores, whose “natural” biological roles and substrates lie in intermediary metabolism. Despite a recent review elsewhere, we can find no evidence that supports BDII: we can find no experiments in intact cells in which phospholipid bilayer diffusion was either varied independently or measured directly (although there are many papers in which it was inferred from the covariation of other dependent variables). By contrast, we find an abundance of evidence showing cases in which changes in the activities of named, genetically identified transporters led to measurable changes in the rate or extent of drug uptake. PBIN also has considerable predictive power: it accounts readily for the large differences in drug uptake between tissues, cells and species, for the metabolite-likeness of marketed drugs, for pharmacogenomic observations, and for the late-stage appearance of toxicity and lack of efficacy during drug discovery programmes despite macroscopically adequate pharmacokinetics.
Consequently, the view that Phospholipid Bilayer diffusion Is Negligible (PBIN) provides a starting hypothesis for assessing cellular drug uptake that is much better supported by the available evidence, and is both more productive and more predictive.
drug transporters; systems pharmacology; pharmacogenomics; Recon2
A major trend in recent Parkinson's disease (PD) research is the investigation of biological markers that could help in identifying at-risk individuals or in tracking disease progression and response to therapies. Central to this is the fact that inflammation is a hallmark of PD and of many other degenerative diseases. In the current work, we focus on inflammatory signalling in PD, using a systems approach that allows us to look at the disease in a more holistic way. We discuss cyclooxygenases, prostaglandins, thromboxanes and also iron in PD. These particular signalling molecules are involved in PD pathophysiology, but are also very important in an aberrant coagulation/hematology system. We present and discuss a hypothesis regarding the possible interaction of these aberrant signalling molecules implicated in PD, and suggest that these molecules may affect the erythrocytes (RBCs) of PD patients. This would be observable as changes in the morphology of the RBCs of PD patients relative to healthy controls. We then show that the RBCs of PD patients are indeed rather dramatically deranged in their morphology, exhibiting eryptosis (a form of programmed cell death). This morphological indicator may have useful diagnostic and prognostic significance.
Parkinson's disease; hypercoagulability; erythrocytes; eryptosis
We rehearse the processes of innovation and discovery in general terms, using as our main metaphor the biological concept of an evolutionary fitness landscape. Incremental and disruptive innovations are seen, respectively, as successful searches carried out locally or more widely. They may also be understood as reflecting evolution by mutation (incremental) versus recombination (disruptive). We also bring a Platonic view, focusing on virtue and memory. We use ‘virtue’ as a measure of the effort, including the knowledge, required to come up with disruptive and incremental innovations, and ‘memory’ as a measure of their lifespan, i.e. how long they are remembered. Fostering innovation, in the evolutionary metaphor, means providing the wherewithal to promote novelty, good objective functions that one is trying to optimize, and means to improve one's knowledge of, and ability to navigate, the landscape one is searching. Recombination necessarily implies multi- or inter-disciplinarity. These principles are generic to all kinds of creativity, the formation of novel ideas, and the development of new products and technologies.
innovation; evolutionary computing; philosophy of science
Introduction: Unliganded iron both contributes to the pathology of Alzheimer's disease (AD) and changes the morphology of erythrocytes (RBCs). We tested the hypothesis that these two facts might be linked, i.e., that the RBCs of AD individuals have a variant morphology that might have diagnostic or prognostic value.
Methods: We performed a literature survey of AD and its relationships to the vascular system, followed by a laboratory study. Four different microscopy techniques were used, and the results were compared statistically to analyze differences between AD individuals with high versus normal serum ferritin (SF).
Results: Light and scanning electron microscopies showed little difference between the morphologies of RBCs taken from healthy individuals and those from normal-SF AD individuals. By contrast, there were substantial changes in the morphology of RBCs taken from high-SF AD individuals. These differences were also observed using confocal microscopy, and were accompanied by a significantly greater membrane stiffness (measured using force-distance curves).
Conclusion: We argue that high ferritin levels may contribute to an accelerated pathology in AD. Our findings reinforce the importance of (unliganded) iron in AD, and suggest the possibility both of an early diagnosis and some means of treating or slowing down the progress of this disease.
Alzheimer's disease; erythrocytes; iron; scanning electron microscopy; atomic force microscopy
Blood-vessel dysfunction arises before overt hyperglycemia in type-2 diabetes (T2DM). We hypothesised that a metabolomic approach might identify metabolites/pathways perturbed in this pre-hyperglycemic phase. To test this hypothesis, and to generate hypotheses regarding specific metabolites, serum metabolic profiling was performed in young women at increased, intermediate and low risk of subsequent T2DM.
Participants were stratified by glucose tolerance during a previous index pregnancy into three risk groups: overt gestational diabetes (GDM; n = 18); those with glucose values in the upper quartile but below GDM levels (UQ group; n = 45); and controls (n = 43, below the median glucose values). Follow-up serum samples were collected at a mean of 22 months postnatally. Samples were analysed in a random order using Ultra Performance Liquid Chromatography coupled to an electrospray hybrid LTQ-Orbitrap mass spectrometer. Statistical analyses included principal component analysis (PCA) and multivariate methods.
Significant between-group differences were observed at follow-up in waist circumference (86 (95% CI 79–91) vs 80 (76–84) cm for GDM vs controls, p<0.05), adiponectin (about 33% lower in the GDM group, p = 0.004), fasting glucose, post-prandial glucose and HbA1c, although the latter three all remained within the ‘normal’ range. Substantial differences in metabolite profiles were apparent between the two ‘at-risk’ groups and controls, particularly in concentrations of phospholipids (4 metabolites with p≤0.01), acylcarnitines (3 with p≤0.02), short- and long-chain fatty acids (3 with p≤0.03), and diglycerides (4 with p≤0.05).
Defects in adipocyte function from excess energy storage as relatively hypoxic visceral and hepatic fat, and impaired mitochondrial fatty acid oxidation may initiate the observed perturbations in lipid metabolism. Together with evidence from the failure of glucose-directed treatments to improve cardiovascular outcomes, these data and those of others indicate that a new, quite different definition of type-2 diabetes is required. This definition would incorporate disturbed lipid metabolism prior to hyperglycemia.
The de novo synthesis of genes is becoming increasingly common in synthetic biology studies. However, the inherent error rate (introduced by errors incurred during oligonucleotide synthesis) limits its use in synthesising protein libraries to only short genes. Here we introduce SpeedyGenes, a PCR-based method for the synthesis of diverse protein libraries that includes an error-correction procedure, enabling the efficient synthesis of large genes for use directly in functional screening. First, we demonstrate an accurate gene synthesis method by synthesising and directly screening (without pre-selection) a 747 bp gene for green fluorescent protein (yielding 85% fluorescent colonies) and a larger 1518 bp gene (a monoamine oxidase, producing 76% of colonies with full catalytic activity, a 4-fold improvement over previous methods). Second, we show that SpeedyGenes can accommodate multiple and combinatorial variant sequences while maintaining efficient enzymatic error correction, which is particularly crucial for larger genes. In its first application to directed evolution, we demonstrate the use of SpeedyGenes in the synthesis and screening of large libraries of MAO-N variants. Using this method, libraries are synthesised, transformed and screened within 3 days. Importantly, as each mutation we introduce is controlled by the oligonucleotide sequence, SpeedyGenes enables the synthesis of large, diverse, yet controlled variant sequences for the purposes of directed evolution.
directed evolution; error correction; gene synthesis; protein libraries
Genomic data now allow the large-scale manual or semi-automated reconstruction of metabolic networks. A network reconstruction represents a highly curated organism-specific knowledge base. A few genome-scale network reconstructions have appeared for metabolism in the baker’s yeast Saccharomyces cerevisiae. These alternative network reconstructions differ in scope and content, and have also used different terminologies to describe the same chemical entities, thus making comparisons between them difficult. The formulation of a ‘community consensus’ network that collects and formalizes the ‘community knowledge’ of yeast metabolism is thus highly desirable. We describe how we have produced a consensus metabolic network reconstruction for S. cerevisiae. Special emphasis is laid on referencing molecules to persistent databases or using database-independent forms such as SMILES or InChI strings, since this permits their chemical structure to be represented unambiguously and in a manner that permits automated reasoning. The reconstruction is readily available via a publicly accessible database and in the Systems Biology Markup Language, and we describe the manner in which it can be maintained as a community resource. It should serve as a common denominator for systems biology studies of yeast. Similar strategies will be of benefit to communities studying genome-scale metabolic networks of other organisms.
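The value of database-independent structure identifiers can be illustrated with a small sketch: two hypothetical reconstructions that name the same metabolites differently are merged by their InChI strings rather than by name. The dictionaries and metabolite names below are invented for illustration; only the InChI strings for water and ethanol are real.

```python
# Sketch: merging two hypothetical reconstructions by InChI rather than by
# name, so that differently-named entries for the same chemical unify.
# The toy dictionaries are illustrative, not taken from actual reconstructions.

RECON_A = {
    "water": "InChI=1S/H2O/h1H2",
    "ethanol": "InChI=1S/C2H6O/c1-2-3/h3H,1-2H3",
}
RECON_B = {
    "H2O": "InChI=1S/H2O/h1H2",
    "EtOH": "InChI=1S/C2H6O/c1-2-3/h3H,1-2H3",
}

def merge_by_structure(*reconstructions):
    """Group synonymous metabolite names under their shared InChI string."""
    merged = {}
    for recon in reconstructions:
        for name, inchi in recon.items():
            merged.setdefault(inchi, set()).add(name)
    return merged

merged = merge_by_structure(RECON_A, RECON_B)
# Two distinct structures remain, each carrying its synonymous names.
print(len(merged))                              # 2
print(sorted(merged["InChI=1S/H2O/h1H2"]))      # ['H2O', 'water']
```

Matching on structure identifiers, rather than free-text names, is what permits the automated reasoning mentioned in the abstract.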
GeneGenie, a new online tool available at http://www.gene-genie.org, is introduced to support the design and self-assembly of synthetic genes and constructs. GeneGenie allows for the design of oligonucleotide cohorts encoding the gene sequence optimized for expression in any suitable host through an intuitive, easy-to-use web interface. The tool ensures consistent oligomer overlapping melting temperatures, minimizes the likelihood of misannealing, optimizes codon usage for expression in a selected host, allows for specification of forward and reverse cloning sequences (for downstream ligation) and also provides support for mutagenesis or directed evolution studies. Directed evolution studies are enabled through the construction of variant libraries via the optional specification of ‘variant codons’, containing mixtures of bases, at any position. For example, specifying the variant codon TNT (where N is any nucleotide) will generate an equimolar mixture of the codons TAT, TCT, TGT and TTT at that position, encoding a mixture of the amino acids Tyr, Ser, Cys and Phe. This facility is demonstrated through the use of GeneGenie to develop and synthesize a library of enhanced green fluorescent protein variants.
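The variant-codon expansion described above is straightforward to sketch. The IUPAC ambiguity codes and the codon assignments for TNT are standard; the function itself is a hypothetical illustration, not GeneGenie's actual implementation.

```python
from itertools import product

# Sketch of expanding a degenerate ('variant') codon into its constituent
# codons, as in the TNT example in the abstract. The IUPAC ambiguity table is
# standard; CODON_TABLE lists only the four codons needed for this example.

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
         "K": "GT", "M": "AC", "N": "ACGT"}

CODON_TABLE = {"TAT": "Tyr", "TCT": "Ser", "TGT": "Cys", "TTT": "Phe"}

def expand_variant_codon(codon):
    """Return every concrete codon matched by a degenerate IUPAC codon."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in codon))]

codons = expand_variant_codon("TNT")
print(codons)                               # ['TAT', 'TCT', 'TGT', 'TTT']
print([CODON_TABLE[c] for c in codons])     # ['Tyr', 'Ser', 'Cys', 'Phe']
```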
• We now have metabolic network models; the metabolome is represented by their nodes.
• Metabolite levels are sensitive to changes in enzyme activities.
• Drugs hitchhike on metabolite transporters to get into and out of cells.
• The consensus network Recon2 represents the present state of the art, and has predictive power.
• Constraint-based modelling relates network structure to metabolic fluxes.
Metabolism represents the ‘sharp end’ of systems biology, because changes in metabolite concentrations are necessarily amplified relative to changes in the transcriptome, proteome and enzyme activities, which can be modulated by drugs. To understand such behaviour, we therefore need (and increasingly have) reliable consensus (community) models of the human metabolic network that include the important transporters. Small molecule ‘drug’ transporters are in fact metabolite transporters, because drugs bear structural similarities to metabolites known from the network reconstructions and from measurements of the metabolome. Recon2 represents the present state-of-the-art human metabolic network reconstruction; it can predict, inter alia: (i) the effects of inborn errors of metabolism; (ii) which metabolites are exometabolites; and (iii) how metabolism varies between tissues and cellular compartments. However, even these qualitative network models are not yet complete. As our understanding improves, so too do we recognise more clearly the need for a systems (poly)pharmacology.
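The constraint-based modelling mentioned above rests on the steady-state condition S·v = 0, where S is the stoichiometric matrix and v the flux vector. A minimal sketch, using a toy three-reaction chain rather than Recon2 itself:

```python
import numpy as np

# Minimal sketch of the constraint-based idea: at steady state the
# stoichiometric matrix S constrains fluxes v via S @ v = 0. The toy network
# (uptake -> A -> B -> secretion) is illustrative, not part of Recon2.

S = np.array([[1.0, -1.0, 0.0],    # metabolite A: made by v1, consumed by v2
              [0.0, 1.0, -1.0]])   # metabolite B: made by v2, consumed by v3

# The feasible steady-state flux space is the null space of S.
_, _, vt = np.linalg.svd(S)
null_basis = vt[S.shape[0]:]        # rows spanning the null space
v = null_basis[0]
v = v / v[0]                        # scale so the uptake flux v1 = 1

print(np.allclose(S @ v, 0))        # True: mass balance holds
print(np.round(v, 6))               # all three fluxes equal: a chain flux
```

Genome-scale models apply exactly this constraint, plus flux bounds and an objective function, over thousands of reactions.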
It is well-known that individuals with increased iron levels are more prone to thrombotic diseases, mainly due to the presence of unliganded iron, and thereby the increased production of hydroxyl radicals. It is also known that erythrocytes (RBCs) may play an important role during thrombotic events. Therefore the purpose of the current study was to assess whether RBCs had an altered morphology in individuals with hereditary hemochromatosis (HH), as well as some who displayed hyperferritinemia (HF). Using scanning electron microscopy, we also assessed means by which the RBC and fibrin morphology might be normalized. An important objective was to test the hypothesis that the altered RBC morphology was due to the presence of excess unliganded iron by removing it through chelation. Very striking differences were observed, in that the erythrocytes from HH and HF individuals were distorted and had a much greater axial ratio compared to that accompanying the discoid appearance seen in the normal samples. The response to thrombin, and the appearance of a platelet-rich plasma smear, were also markedly different. These differences could largely be reversed by the iron chelator desferal and to some degree by the iron chelator clioquinol, or by the free radical trapping agents salicylate or selenite (that may themselves also be iron chelators). These findings are consistent with the view that the aberrant morphology of the HH and HF erythrocytes is caused, at least in part, by unliganded (‘free’) iron, whether derived directly via raised ferritin levels or otherwise, and that lowering it or affecting the consequences of its action may be of therapeutic benefit. The findings also bear on the question of the extent to which accepting blood donations from HH individuals may be desirable or otherwise.
Mapping the landscape of possible macromolecular polymer sequences to their fitness in performing biological functions is a challenge across the biosciences. A paradigm is the case of aptamers, nucleic acids that can be selected to bind particular target molecules. We have characterized the sequence-fitness landscape for aptamers binding allophycocyanin (APC) protein via a novel Closed Loop Aptameric Directed Evolution (CLADE) approach. In contrast to the conventional SELEX methodology, selection and mutation of aptamer sequences was carried out in silico, with explicit fitness assays for 44 131 aptamers of known sequence using DNA microarrays in vitro. We capture the landscape using a predictive machine learning model linking sequence features and function and validate this model using 5500 entirely separate test sequences, which give a very high observed versus predicted correlation of 0.87. This approach reveals a complex sequence-fitness mapping, and hypotheses for the physical basis of aptameric binding; it also enables rapid design of novel aptamers with desired binding properties. We demonstrate an extension to the approach by incorporating prior knowledge into CLADE, resulting in some of the tightest binding sequences.
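The machine-learning step, linking sequence features to measured fitness and validating on held-out sequences, can be sketched with synthetic data. The one-hot encoding and ridge-regularised linear model below are illustrative stand-ins for the actual CLADE model and microarray measurements.

```python
import numpy as np

# Sketch of sequence-to-fitness modelling: encode short sequences numerically,
# fit on 'training' sequences, and check predictions on held-out ones.
# Synthetic linear fitness + least squares stand in for the real CLADE data.

rng = np.random.default_rng(0)
BASES = "ACGT"
L = 8                                   # toy sequence length (not aptamer-scale)
n_train, n_test = 400, 100

def one_hot(seq):
    return [1.0 if b == base else 0.0 for b in seq for base in BASES]

seqs = ["".join(rng.choice(list(BASES), L)) for _ in range(n_train + n_test)]
X = np.array([one_hot(s) for s in seqs])

true_w = rng.normal(size=X.shape[1])    # hidden 'physics' of binding (invented)
y = X @ true_w + rng.normal(scale=0.1, size=len(seqs))

# Fit ridge-regularised least squares on training sequences only.
Xtr, ytr = X[:n_train], y[:n_train]
Xte, yte = X[n_train:], y[n_train:]
w = np.linalg.solve(Xtr.T @ Xtr + 1e-3 * np.eye(X.shape[1]), Xtr.T @ ytr)

r = np.corrcoef(Xte @ w, yte)[0, 1]
print(f"held-out observed-vs-predicted correlation: {r:.2f}")
```

Real sequence-fitness landscapes are far less linear than this toy, which is why the paper needed richer models; the train/held-out protocol, however, is the same.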
Multiple models of human metabolism have been reconstructed, but each represents only a subset of our knowledge. Here we describe Recon 2, a community-driven, consensus ‘metabolic reconstruction’, which is the most comprehensive representation of human metabolism that is applicable to computational modeling. Compared with its predecessors, the reconstruction has improved topological and functional features, including ~2× more reactions and ~1.7× more unique metabolites. Using Recon 2 we predicted changes in metabolite biomarkers for 49 inborn errors of metabolism with 77% accuracy when compared to experimental data. Mapping metabolomic data and drug information onto Recon 2 demonstrates its potential for integrating and analyzing diverse data types. Using protein expression data, we automatically generated a compendium of 65 cell type–specific models, providing a basis for manual curation or investigation of cell-specific metabolic properties. Recon 2 will facilitate many future biomedical studies and is freely available at http://humanmetabolism.org/.
Systems biology projects and omics technologies have led to a growing number of biochemical pathway models and reconstructions. However, the majority of these models are still created de novo, based on literature mining and the manual processing of pathway data.
To increase the efficiency of model creation, the Path2Models project has automatically generated mathematical models from pathway representations using a suite of freely available software. Data sources include KEGG, BioCarta, MetaCyc and SABIO-RK. Depending on the source data, three types of models are provided: kinetic, logical and constraint-based. Models from over 2 600 organisms are encoded consistently in SBML, and are made freely available through BioModels Database at http://www.ebi.ac.uk/biomodels-main/path2models. Each model contains the list of participants, their interactions, the relevant mathematical constructs, and initial parameter values. Most models are also available as easy-to-understand graphical SBGN maps.
To date, the project has resulted in more than 140 000 freely available models. Such a resource can tremendously accelerate the development of mathematical models by providing initial starting models for simulation and analysis, which can be subsequently curated and further parameterized.
Modular rate law; Constraint based models; Logical models; SBGN; SBML
We present an experimental and computational pipeline for the generation of kinetic models of metabolism, and demonstrate its application to glycolysis in Saccharomyces cerevisiae. Starting from an approximate mathematical model, we employ a “cycle of knowledge” strategy, identifying the steps with most control over flux. Kinetic parameters of the individual isoenzymes within these steps are measured experimentally under a standardised set of conditions. Experimental strategies are applied to establish a set of in vivo concentrations for isoenzymes and metabolites. The data are integrated into a mathematical model that is used to predict a new set of metabolite concentrations and reevaluate the control properties of the system. This bottom-up modelling study reveals that control over the metabolic network most directly involved in yeast glycolysis is more widely distributed than previously thought.
Glycolysis; Systems biology; Enzyme kinetics; Isoenzyme; Modelling
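The notion of distributed control at the heart of this study can be made concrete with flux control coefficients, C_i = ∂lnJ/∂lnE_i. A minimal sketch using a toy two-step pathway with mass-action kinetics (not the yeast glycolysis model itself); the flux summation theorem, ΣC_i = 1, serves as a built-in check.

```python
# Flux control coefficients C_i = dlnJ/dlnE_i, estimated by perturbing each
# enzyme of a toy two-step linear pathway. Kinetics are illustrative
# (mass-action), not those of the glycolysis model in the paper.

S0 = 10.0                      # fixed 'external' substrate concentration
k1, k2 = 1.0, 2.0              # rate constants per unit enzyme

def steady_state_flux(E1, E2):
    # v1 = k1*E1*(S0 - S1), v2 = k2*E2*S1; setting v1 = v2 gives S1 and J.
    S1 = k1 * E1 * S0 / (k1 * E1 + k2 * E2)
    return k2 * E2 * S1

def control_coefficient(perturb_first, E1=1.0, E2=1.0, eps=1e-6):
    J = steady_state_flux(E1, E2)
    J2 = (steady_state_flux(E1 * (1 + eps), E2) if perturb_first
          else steady_state_flux(E1, E2 * (1 + eps)))
    return (J2 - J) / J / eps          # finite-difference dlnJ / dlnE_i

C1 = control_coefficient(True)
C2 = control_coefficient(False)
print(round(C1 + C2, 4))               # 1.0 -- the flux summation theorem
```

Even in this two-step toy, control is shared (here C1 ≈ 2/3, C2 ≈ 1/3); the paper's finding is that the same sharing, measured this way, is broader in yeast glycolysis than previously thought.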
Following a strategy similar to that used for baker’s yeast (Herrgård et al., Nat Biotechnol 26:1155–1160, 2008; Dobson et al., BMC Syst Biol 4:145, 2010; Heavner et al., BMC Syst Biol 6:55, 2012) and for Salmonella typhimurium (Thiele et al., BMC Syst Biol 5:8, 2011), a recent paper (Thiele et al., Nat Biotechnol 31:419–425, 2013) described a much improved ‘community consensus’ reconstruction of the human metabolic network, called Recon 2, and the authors (who include the present ones) have made it freely available via a database at http://humanmetabolism.org/ and in SBML format at Biomodels (http://identifiers.org/biomodels.db/MODEL1109130000). This short analysis summarises the main findings, and suggests some approaches that will be able to exploit the availability of this model to advantage.
Metabolism; Modelling; Systems biology; Networks; Metabolic networks
Motivation: To create, verify and maintain pathway models, curators must discover and assess knowledge distributed over the vast body of biological literature. Methods supporting these tasks must understand both the pathway model representations and the natural language in the literature. These methods should identify and order documents by relevance to any given pathway reaction. No existing system has addressed all aspects of this challenge.
Method: We present novel methods for associating pathway model reactions with relevant publications. Our approach extracts the reactions directly from the models and then turns them into queries for three text mining-based MEDLINE literature search systems. These queries are executed, and the resulting documents are combined and ranked according to their relevance to the reactions of interest. We manually annotate document-reaction pairs with the relevance of the document to the reaction and use this annotation to study several ranking methods, using various heuristic and machine-learning approaches.
Results: Our evaluation shows that the annotated document-reaction pairs can be used to create a rule-based document ranking system, and that machine learning can be used to rank documents by their relevance to pathway reactions. We find that a Support Vector Machine-based system outperforms several baselines and matches the performance of the rule-based system. The successful query extraction and ranking methods have been used to update our existing pathway search system, PathText.
Availability: An online demonstration of PathText 2 and the annotated corpus are available for research purposes at http://www.nactem.ac.uk/pathtext2/.
Supplementary data are available at Bioinformatics online.
Comparatively few studies have addressed directly the question of quantifying the benefits to be had from using molecular genetic markers in experimental breeding programmes (e.g. for improved crops and livestock), or the question of which organisms should be mated with each other to best effect. We argue that this requires in silico modelling, an approach for which there is a large literature in the field of evolutionary computation (EC), but which has not really been applied in this way to experimental breeding programmes. EC seeks to optimise measurable outcomes (phenotypic fitnesses) by optimising in silico the mutation, recombination and selection regimes that are used. We review some of the approaches from EC, and compare experimentally, using a biologically relevant in silico landscape, some algorithms that have knowledge of where they are in the (genotypic) search space (G-algorithms) with some (albeit well-tuned ones) that do not (F-algorithms). For the present kinds of landscapes, F- and G-algorithms were broadly comparable in quality and effectiveness, although we recognise that the G-algorithms were not equipped with any ‘prior knowledge’ of epistatic pathway interactions. This use of algorithms based on machine learning has important implications for the optimisation of experimental breeding programmes in the post-genomic era when we shall potentially have access to the full genome sequence of every organism in a breeding population. The non-proprietary code that we have used is made freely available (via Supplementary information).
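The EC framing of selection, mutation and recombination can be sketched on a deliberately simple landscape. OneMax below is a stand-in for the biologically relevant landscapes used in the paper, and the algorithm settings are illustrative, not tuned.

```python
import random

# Toy illustration of the EC framing: evolve a population of binary 'genomes'
# toward higher fitness, with and without recombination. OneMax (count the
# favourable alleles) replaces the paper's biologically relevant landscape.

random.seed(42)
L, POP, GENS, MU = 40, 30, 60, 1.0 / 40

def fitness(g):
    return sum(g)                       # OneMax: number of favourable alleles

def mutate(g):
    return [b ^ (random.random() < MU) for b in g]

def evolve(recombine):
    pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:POP // 2]        # truncation selection
        children = []
        while len(children) < POP:
            p1, p2 = random.sample(parents, 2)
            child = ([random.choice(pair) for pair in zip(p1, p2)]
                     if recombine else list(p1))
            children.append(mutate(child))
        pop = children
    return max(fitness(g) for g in pop)

best_mut = evolve(recombine=False)     # mutation-only ('incremental') search
best_rec = evolve(recombine=True)      # search with recombination
print(best_mut, best_rec)              # both approach the optimum of 40
```

On an additive landscape like OneMax the two regimes perform similarly; the paper's point is that differences emerge on landscapes with epistatic interactions.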
The soil represents a reservoir that contains at least twice as much carbon as does the atmosphere, yet (apart from ‘root crops’) mainly just the above-ground plant biomass is harvested in agriculture, and plant photosynthesis represents the effective origin of the overwhelming bulk of soil carbon. However, present estimates of the carbon sequestration potential of soils are based more on what is happening now than what might be changed by active agricultural intervention, and tend to concentrate only on the first metre of soil depth.
Breeding crop plants with deeper and bushy root ecosystems could simultaneously improve both the soil structure and its steady-state carbon, water and nutrient retention, as well as sustainable plant yields. The carbon that can be sequestered in the steady state by increasing the rooting depths of crop plants and grasses from, say, 1 m to 2 m depends significantly on its lifetime(s) in different molecular forms in the soil, but calculations (http://dbkgroup.org/carbonsequestration/rootsystem.html) suggest that this breeding strategy could have a hugely beneficial effect in stabilizing atmospheric CO2. This sets an important research agenda, and the breeding of plants with improved and deep rooting habits and architectures is a goal well worth pursuing.
Breeding; deep roots; genetics; root architecture; carbon sequestration; nutrient efficiency; drought resistance; soil structure; perenniality
A considerable number of areas of bioscience, including gene and drug discovery, metabolic engineering for the biotechnological improvement of organisms, and the processes of natural and directed evolution, are best viewed in terms of a ‘landscape’ representing a large search space of possible solutions or experiments populated by a considerably smaller number of actual solutions that then emerge. This is what makes these problems ‘hard’, but as such these are to be seen as combinatorial optimisation problems that are best attacked by heuristic methods known from that field. Such landscapes, which may also represent or include multiple objectives, are effectively modelled in silico, with modern active learning algorithms such as those based on Darwinian evolution providing guidance, using existing knowledge, as to what is the ‘best’ experiment to do next. An awareness, and the application, of these methods can thereby enhance the scientific discovery process considerably. This analysis fits comfortably with an emerging epistemology that sees scientific reasoning, the search for solutions, and scientific discovery as Bayesian processes.
automation; epistemology; evolutionary computing; heuristics; scientific discovery
Constraint-based analysis of genome-scale metabolic models typically relies upon maximisation of a cellular objective function such as the rate or efficiency of biomass production. Whilst this assumption may be valid in the case of microorganisms growing under certain conditions, it is likely invalid in general, and especially for multicellular organisms, where cellular objectives differ greatly both between and within cell types. Moreover, for the purposes of biotechnological applications, it is normally the flux to a specific metabolite or product that is of interest rather than the rate of production of biomass per se.
An alternative objective function is presented, that is based upon maximising the correlation between experimentally measured absolute gene expression data and predicted internal reaction fluxes. Using quantitative transcriptomics data acquired from Saccharomyces cerevisiae cultures under two growth conditions, the method outperforms traditional approaches for predicting experimentally measured exometabolic flux that are reliant upon maximisation of the rate of biomass production.
Due to its improved prediction of experimentally measured metabolic fluxes, and its lack of a requirement for knowledge of the biomass composition of the organism under the conditions of interest, the approach is likely to be of rather general utility. The method has been shown to predict fluxes reliably in single cellular systems. Subsequent work will investigate the method’s ability to generate condition- and tissue-specific flux predictions in multicellular organisms.
Flux balance analysis; Metabolic flux; Metabolic networks; Transcriptomics; RNA-Seq; Exometabolomics
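The proposed objective, choosing among feasible flux distributions the one best correlated with absolute expression data, can be sketched as follows. The candidate flux vectors and expression values are invented for illustration; in practice they would come from the genome-scale model and the transcriptomics measurements.

```python
# Sketch of the correlation-based objective: among candidate flux
# distributions that all satisfy the network's constraints, prefer the one
# whose internal fluxes correlate best with measured absolute gene
# expression. All numbers below are made up for illustration.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

expression = [5.0, 1.0, 3.0, 0.5]            # per-reaction transcript levels

candidates = {                                # hypothetical feasible fluxes
    "near-uniform":    [2.0, 2.1, 1.9, 2.0],
    "expression-like": [4.5, 1.2, 2.8, 0.6],
    "reversed":        [0.5, 3.0, 1.0, 5.0],
}

scores = {name: pearson(v, expression) for name, v in candidates.items()}
best = max(scores, key=scores.get)
print(best)                                   # expression-like
```

The real method searches the continuous flux polytope rather than a handful of pre-enumerated vectors, but the selection criterion is the same.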
The soil holds twice as much carbon as does the atmosphere, and most soil carbon is derived from recent photosynthesis that takes carbon into root structures and further into below-ground storage via exudates therefrom. Nonetheless, many natural and most agricultural crops have roots that extend only to about 1 m below ground. What determines the lifetime of below-ground C in various forms is not well understood, and understanding these processes is therefore key to optimising them for enhanced C sequestration. Most soils (and especially subsoils) are very far from being saturated with organic carbon, and calculations show that the amounts of C that might further be sequestered (http://dbkgroup.org/carbonsequestration/rootsystem.html) are actually very great. Breeding crops with desirable below-ground C sequestration traits, and exploiting attendant agronomic practices optimised for individual species in their relevant environments, are therefore important goals. These bring additional benefits related to improvements in soil structure and in the usage of other nutrients and water.
soil; carbon; sequestration; systems biology; breeding
The control of biochemical fluxes is distributed, and to perturb complex intracellular networks effectively it is often necessary to modulate several steps simultaneously. However, the number of possible permutations leads to a combinatorial explosion in the number of experiments that would have to be performed in a complete analysis. We used a multi-objective evolutionary algorithm (EA) to optimize reagent combinations from a dynamic chemical library of 33 compounds with established or predicted targets in the regulatory network controlling IL-1β expression. The EA converged on excellent solutions within 11 generations during which we studied just 550 combinations out of the potential search space of ~9 billion. The top five reagents with the greatest contribution to combinatorial effects throughout the EA were then optimized pairwise. A p38 MAPK inhibitor with either an inhibitor of IκB kinase or a chelator of poorly liganded iron yielded synergistic inhibition of macrophage IL-1β expression. Evolutionary searches provide a powerful and general approach to the discovery of novel combinations of pharmacological agents with potentially greater therapeutic indices than those of single drugs.
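A back-of-envelope check on the quoted search space: 33 compounds, each present or absent, give 2^33 ≈ 9 billion combinations, of which the EA sampled only 550. The Bliss-independence excess shown below is one standard way to score synergy for a pair such as those reported; the inhibition values are hypothetical.

```python
# Search-space arithmetic for the abstract above, plus a Bliss-independence
# synergy score. The 40%/30%/80% inhibition values are invented for
# illustration, not taken from the paper's measurements.

n_combinations = 2 ** 33
print(n_combinations)                    # 8589934592, i.e. ~9 billion
print(550 / n_combinations)              # vanishingly small sampled fraction

def bliss_excess(inhib_a, inhib_b, inhib_combo):
    """Observed minus expected fractional inhibition under Bliss independence."""
    expected = 1 - (1 - inhib_a) * (1 - inhib_b)
    return inhib_combo - expected

# Hypothetical single-agent inhibitions of 40% and 30%, combined 80%:
print(round(bliss_excess(0.4, 0.3, 0.8), 2))   # 0.22 -> synergistic pair
```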
Research into event-based text mining from the biomedical literature has been growing in popularity to facilitate the development of advanced biomedical text mining systems. Such technology permits advanced search, which goes beyond document or sentence-based retrieval. However, existing event-based systems typically ignore additional information within the textual context of events that can determine, amongst other things, whether an event represents a fact, hypothesis, experimental result or analysis of results, whether it describes new or previously reported knowledge, and whether it is speculated or negated. We refer to such contextual information as meta-knowledge. The automatic recognition of such information can permit the training of systems allowing finer-grained searching of events according to the meta-knowledge that is associated with them.
Based on a corpus of 1,000 MEDLINE abstracts, fully manually annotated with both events and associated meta-knowledge, we have constructed a machine learning-based system that automatically assigns meta-knowledge information to events. This system has been integrated into EventMine, a state-of-the-art event extraction system, in order to create a more advanced system (EventMine-MK) that not only extracts events from text automatically, but also assigns five different types of meta-knowledge to these events. The meta-knowledge assignment module of EventMine-MK performs with macro-averaged F-scores in the range of 57–87% on the BioNLP’09 Shared Task corpus. EventMine-MK has been evaluated on the BioNLP’09 Shared Task subtask of detecting negated and speculated events. Our results show that EventMine-MK can outperform other state-of-the-art systems that participated in this task.
We have constructed the first practical system that extracts both events and associated, detailed meta-knowledge information from biomedical literature. The automatically assigned meta-knowledge information can be used to refine search systems, in order to provide an extra search layer beyond entities and assertions, dealing with phenomena such as rhetorical intent, speculations, contradictions and negations. This finer grained search functionality can assist in several important tasks, e.g., database curation (by locating new experimental knowledge) and pathway enrichment (by providing information for inference). To allow easy integration into text mining systems, EventMine-MK is provided as a UIMA component that can be used in the interoperable text mining infrastructure, U-Compare.
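As a much-simplified illustration of what meta-knowledge assignment means in practice (this is not the EventMine-MK implementation, which uses trained machine-learning classifiers over rich event features), the sketch below flags negation and speculation for an event's sentence using invented lexical cue lists:

```python
# Illustrative cue lists -- invented for this sketch, not taken from
# the annotated corpus that EventMine-MK was actually trained on.
NEGATION_CUES = {"not", "no", "failed", "absence", "unable"}
SPECULATION_CUES = {"may", "might", "suggest", "suggests", "possible"}
KNOWN_CUES = {"known", "previously", "reported"}

def assign_meta_knowledge(sentence: str) -> dict:
    """Assign coarse meta-knowledge labels from surface lexical cues."""
    tokens = {t.strip(".,;:()").lower() for t in sentence.split()}
    return {
        "negated": bool(tokens & NEGATION_CUES),
        "speculated": bool(tokens & SPECULATION_CUES),
        "knowledge_type": "Known" if tokens & KNOWN_CUES else "Other",
    }

mk = assign_meta_knowledge(
    "These results suggest that IL-1beta may not be induced.")
```

A real system replaces the cue sets and boolean tests with classifiers trained on annotated context, but the output shape, a small set of meta-knowledge labels attached to each extracted event, is the same idea.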
White's lab established that strong, continuous stimulation with tumour necrosis factor-α (TNFα) can induce sustained oscillations in the subcellular localisation of the transcription factor nuclear factor κB (NF-κB). However, the intensity of the TNFα signal varies substantially, from picomolar in the blood plasma of healthy organisms to nanomolar in diseased states. We report on a systematic survey using computational bifurcation theory to explore the relationship between the intensity of TNFα stimulation and the existence of sustained NF-κB oscillations. Using a deterministic model developed by Ashall et al. in 2009, we find that the system's responses to TNFα are characterised by a supercritical Hopf bifurcation point: above a critical intensity of TNFα the system exhibits sustained oscillations in NF-κB localisation. For TNFα below this critical value, damped oscillations are observed. This picture depends, however, on the values of the model's other parameters. When the values of certain reaction rates are altered, the response of the signalling pathway to TNFα stimulation changes: in addition to the sustained oscillations induced by high-dose stimulation, a second oscillatory regime appears at much lower doses. Finally, we define scores to quantify the sensitivity of the system's dynamics to variation in its parameters and use these scores to establish that the qualitative dynamics are most sensitive to the details of NF-κB-mediated gene transcription.
Keywords: NF-κB signalling pathway; parameter sensitivity; bifurcation analysis; oscillations
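The qualitative transition at a supercritical Hopf bifurcation can be illustrated with the Hopf normal form. This is not the Ashall et al. NF-κB model, just the canonical two-variable system exhibiting the same behaviour: for a stimulus-like parameter μ below zero, oscillations are damped; above zero, trajectories settle onto a stable limit cycle of amplitude √μ.

```python
import math

def simulate(mu, omega=1.0, x0=0.5, y0=0.0, dt=0.01, steps=20000):
    """Euler integration of the Hopf normal form:
         dx/dt = mu*x - omega*y - x*(x^2 + y^2)
         dy/dt = omega*x + mu*y - y*(x^2 + y^2)
    Returns the oscillation amplitude at the end of the run."""
    x, y = x0, y0
    for _ in range(steps):
        r2 = x * x + y * y
        x, y = (x + dt * (mu * x - omega * y - x * r2),
                y + dt * (omega * x + mu * y - y * r2))
    return math.hypot(x, y)

# Below the bifurcation point: damped oscillations decay to the
# fixed point, so the long-run amplitude is essentially zero.
damped = simulate(mu=-0.2)

# Above it: sustained oscillations on a limit cycle of radius
# approximately sqrt(mu).
sustained = simulate(mu=0.2)
```

In the paper's setting, TNFα intensity plays the role of μ: crossing the critical dose switches the NF-κB system from damped to sustained oscillations, which is exactly the signature a numerical bifurcation analysis detects.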