Adverse drug events (ADEs) are common, accounting for 770,000 injuries and deaths each year, and drug interactions account for as much as 30% of these ADEs. Spontaneous reporting systems routinely collect ADEs from patients on complex combinations of medications and provide an opportunity to discover unexpected drug interactions. Unfortunately, current algorithms for such “signal detection” are limited by underreporting of interactions that are not expected. We present a novel method to identify latent drug interaction signals in the case of underreporting.
Materials and Methods
We identified eight clinically significant adverse events. We used the FDA's Adverse Event Reporting System to build profiles for these adverse events based on the side effects of drugs known to produce them. We then looked for pairs of drugs that match these single-drug profiles in order to predict potential interactions. We evaluated these interactions in two independent data sets and also through a retrospective analysis of the Stanford Hospital electronic medical records.
We identified 171 novel drug interactions (for eight adverse event categories) that are significantly enriched for known drug interactions (p = 0.0009). We also used the electronic medical record to test drug interaction hypotheses independently, using multivariate statistical models with covariates.
Our method provides an option for detecting hidden interactions in spontaneous reporting systems by using side effect profiles to infer the presence of unreported adverse events.
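The enrichment for known interactions reported above can be illustrated with a one-sided hypergeometric test. This is only a sketch: the abstract does not say which test produced p = 0.0009, so the choice of test, the function name `hypergeom_tail`, and all counts below are assumptions made for illustration.

```python
from math import comb


def hypergeom_tail(population, known, drawn, overlap):
    """P(X >= overlap) when `drawn` drug pairs are sampled without
    replacement from `population` candidate pairs, of which `known`
    are already-documented interactions."""
    upper = min(known, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(known, i) * comb(population - known, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Made-up counts for illustration only; these are not the study's data.
p_value = hypergeom_tail(population=1000, known=50, drawn=100, overlap=12)
print(f"enrichment p-value: {p_value:.4g}")
```

A small p-value here would mean that the predicted pairs overlap with known interactions more often than random selection of pairs would.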
Drug interactions; signal detection analysis; adverse effects; pharmacoepidemiology
A review of 2010 research in translational bioinformatics provides much to marvel at. We have seen notable advances in personal genomics, pharmacogenetics, and sequencing. At the same time, the infrastructure for the field has burgeoned. While acknowledging that, according to researchers, the members of this field tend to be overly optimistic, the authors predict a bright future.
Translational bioinformatics; computational biology; genomics; electronic medical records
ADORA2A; caffeine; CYP1A2; pathway; pharmacogenomics
cardiovascular toxicity; colon cancer; COX-2; coxibs; celecoxib; CYP2C9; drug response; inflammation; nonsteroidal anti-inflammatory drugs; pathway; pharmacogenomics; selective COX-2 inhibitors
As public microarray repositories rapidly accumulate gene expression data, these resources contain increasingly valuable information about cellular processes in human biology. This presents a unique opportunity for intelligent data mining methods to extract information about the transcriptional modules underlying these biological processes. Modeling cellular gene expression as a combination of functional modules, we use independent component analysis (ICA) to derive 423 fundamental components of human biology from a 9,395-array compendium of heterogeneous expression data. Annotation using the Gene Ontology (GO) suggests that while some of these components represent known biological modules, others may describe biology not well characterized by existing manually-curated ontologies. In order to understand the biological functions represented by these modules, we investigate the mechanism of the preclinical anticancer drug parthenolide (PTL) by analyzing the differential expression of our fundamental components. Our method correctly identifies known pathways and predicts that N-glycan biosynthesis and T-cell receptor signaling may contribute to PTL response. The fundamental gene modules we describe have the potential to provide pathway-level insight into new gene expression datasets.
microarrays; independent component analysis; data mining; parthenolide; gene modules
There is debate about the utility of clinical data warehouses for research. Using a clinical warfarin dosing algorithm derived from research-quality data, we evaluated the data quality of both a general-purpose database and a coagulation-specific database. We evaluated the functional utility of these repositories by using data extracted from them to predict warfarin dose. We reasoned that high-quality clinical data would predict doses nearly as accurately as research data, while poor-quality clinical data would predict doses less accurately. We evaluated the Mean Absolute Error (MAE) in predicted weekly dose as a metric of data quality. The MAE was comparable between the clinical gold standard (10.1 mg/wk) and the specialty database (10.4 mg/wk), but the MAE for the clinical warehouse was 40% greater (14.1 mg/wk). Our results indicate that the research utility of clinical data collected in focused clinical settings is greater than that of data collected during general-purpose clinical care.
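The data-quality metric described above is straightforward to compute. The following minimal sketch shows the Mean Absolute Error calculation and checks the reported "40% greater" figure against the stated MAEs. The function names and the sample doses are hypothetical; only the 10.1 and 14.1 mg/wk values come from the text.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted doses."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_increase(baseline, value):
    """Fractional increase of `value` over `baseline`."""
    return (value - baseline) / baseline


# Hypothetical weekly warfarin doses in mg/wk (illustration only).
actual_doses = [35.0, 28.0, 42.0, 21.0]
predicted_doses = [30.0, 31.5, 38.0, 27.0]
print(f"MAE: {mean_absolute_error(actual_doses, predicted_doses)} mg/wk")

# MAEs reported in the text: 10.1 mg/wk (gold standard), 14.1 mg/wk (warehouse).
print(f"warehouse vs. gold standard: {relative_increase(10.1, 14.1):.1%}")
```

The relative increase of 14.1 over 10.1 mg/wk comes to just under 40%, consistent with the rounded figure in the abstract.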
clinical; translational; database; warehouse; research; quality; warfarin; dosing; STRIDE; CoagClinic
drug-induced oxidative stress; glucose-6-phosphate dehydrogenase deficiency; hemolytic anemia; pharmacodynamics; pharmacokinetics; polymorphic variants
Warfarin dosing remains challenging because of its narrow therapeutic window and large variability in dose response. We sought to analyze new factors involved in its dosing and to evaluate eight dosing algorithms, including two developed by the International Warfarin Pharmacogenetics Consortium (IWPC).
We enrolled 108 patients on chronic warfarin therapy and obtained complete clinical and pharmacy records; we genotyped single nucleotide polymorphisms relevant to the VKORC1, CYP2C9, and CYP4F2 genes using integrated fluidic circuits made by Fluidigm.
When the IWPC pharmacogenetic algorithm is applied to our cohort of patients, the percentage of patients within 1 mg/d of the therapeutic warfarin dose rises to 63%, from 54% using clinical factors only and from 38% using a fixed-dose approach. CYP4F2 adds 4% to the fraction of the variability in dose (R²) explained by the IWPC pharmacogenetic algorithm (P < 0.05). Importantly, we show that pooling rare variants substantially increases the R² for CYP2C9 (rare variants: P = 0.0065, R² = 6%; common variants: P = 0.0034, R² = 7%; rare and common variants: P = 0.00018, R² = 12%), indicating that relatively rare variants not genotyped in genome-wide association studies may be important. In addition, the IWPC pharmacogenetic algorithm and the Gage (2008) algorithm perform best (IWPC: R² = 50%; Gage: R² = 49%), and all pharmacogenetic algorithms outperform the IWPC clinical equation (R² = 22%). VKORC1 and CYP2C9 genotypes did not affect long-term variability in dose. Finally, the Fluidigm platform, a novel warfarin genotyping method, showed 99.65% concordance between different operators and instruments.
CYP4F2 and pooled rare variants of CYP2C9 significantly improve the ability to estimate warfarin dose.
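The two performance measures used in the results above, the share of patients predicted within 1 mg/d of the therapeutic dose and the fraction of dose variability explained (R²), can be sketched as follows. This assumes R² is the ordinary coefficient of determination and that the 1 mg/d threshold is inclusive; neither detail is stated in the abstract. The function names and the doses are hypothetical.

```python
def fraction_within(actual, predicted, tolerance=1.0):
    """Share of patients whose predicted dose is within `tolerance`
    (mg/d, inclusive by assumption) of the actual therapeutic dose."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical daily doses in mg/d (illustration only, not study data).
therapeutic = [5.0, 3.0, 7.5, 4.0, 6.0]
predicted = [4.5, 4.5, 7.0, 3.2, 8.0]
print(f"within 1 mg/d: {fraction_within(therapeutic, predicted):.0%}")
print(f"R^2: {r_squared(therapeutic, predicted):.2f}")
```

Comparing algorithms then amounts to running both functions on each algorithm's predictions for the same cohort.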
algorithms; CYP2C9; CYP4F2; dosing; IWPC; kinetics; pharmacogenetics; rare variants; VKORC1; warfarin
drug response; genetic variants; pharmacogenomics; vitamin D receptor
CYP1A2; caffeine; pharmacogene; PharmGKB
carbamazepine; cytochrome P450 metabolizing enzymes; HLA-B; pharmacogenomics; pharmacokinetics
CYP2A6; inter-individual variation; pharmacokinetics; genetic polymorphisms; drug metabolism; drug efficacy
The number of molecules with solved three-dimensional structure but unknown function is increasing rapidly. Particularly problematic are novel folds with little detectable similarity to molecules of known function. Experimental assays can determine the functions of such molecules, but are time-consuming and expensive. Computational approaches can identify potential functional sites; however, these approaches generally rely on single static structures and do not use information about dynamics. In fact, structural dynamics can enhance function prediction: we coupled molecular dynamics simulations with structure-based function prediction algorithms that identify Ca2+ binding sites. When applied to 11 challenging proteins, both methods showed substantial improvement in performance, revealing 22 more sites in one case and 12 more in the other, with a modest increase in apparent false positives. Thus, we show that treating molecules as dynamic entities improves the performance of structure-based function prediction methods.
citalopram; escitalopram; pharmacogenomics; pharmacokinetics; PharmGKB; selective serotonin reuptake inhibitor
CYP3A5; CYP3A5*2; CYP3A5*3; CYP3A5*6; CYP3A5*7; pharmacogenomics; rs10264272; rs28365083; rs76293380; rs776746
cyclooxygenase-2; coxibs; non-steroidal anti-inflammatory drugs; pharmacogenomics; PTGS2; rs20417; rs5275; rs689466
aspirin; clopidogrel; glycoprotein IIb–IIIa inhibitors; pharmacogenomics; PharmGKB; platelet activation; platelet aggregation; polymorphism
dopamine receptor D2; PharmGKB; rs1799732; rs1800497; rs6277; rs1801028
CYP2C19; CYP2C9; HLA-B; pathway; pharmacogenomic; pharmacokinetics; phenytoin; SCN1A
CYP2J; CYP2J2; CYP2J2*7; epoxygenase; PharmGKB; rs890293
candidate genes; dihydropyrimidine dehydrogenase; drug efficacy; drug toxicity; fluoropyrimidines; fluorouracil; methylenetetrahydrofolate reductase; pharmacogenomics; pyrimidine analogs; thymidylate synthase
ABC transporter; drug permeability; multidrug resistance; pharmacogenomics; PharmGKB
SNPs&GO is a method for predicting deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM). For a given protein, its input comprises the sequence and/or its three-dimensional structure (when available), a set of target variations, and the protein's functional Gene Ontology (GO) terms. For each protein variation, the server outputs the probability that the variation is associated with human disease.
The server consists of two main components: updated versions of the sequence-based SNPs&GO program (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d program. The sequence- and structure-based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. On a balanced dataset of more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, a correlation coefficient of 0.61, and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method achieves 84% overall accuracy, a correlation coefficient of 0.68, and an AUC of 0.91. When tested on a new blind set of variations, the server achieves 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively.
WS-SNPs&GO is a valuable tool that integrates, in a single framework, information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go.
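For readers unfamiliar with the evaluation metrics quoted above, the sketch below computes overall accuracy and a correlation coefficient from a binary confusion matrix. It assumes the "correlation coefficient" is the Matthews correlation coefficient, which the abstract does not state explicitly. The function names and the counts are made up for illustration and are not taken from the SwissVar evaluation.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of variations classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.
    Returns 0.0 when any marginal total is zero (undefined case)."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Made-up counts for a balanced set of 200 variations (illustration only):
# tp/fn = disease-related variations, tn/fp = neutral variations.
counts = dict(tp=80, tn=82, fp=18, fn=20)
print(f"accuracy: {accuracy(**counts):.2f}")
print(f"MCC: {matthews_corrcoef(**counts):.2f}")
```

Unlike accuracy, the Matthews coefficient stays informative when the two classes are unequal in size, which is one reason it is often reported alongside accuracy.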
HMGCR; 3-hydroxy-3-methylglutaryl coenzyme A reductase; PharmGKB; pravastatin; statin