We examine how physiology and pathophysiology are studied from a systems perspective, using high-throughput experiments and computational analysis of regulatory networks. We describe the integration of these analyses with pharmacology, which leads to new understanding of drug action and enables drug discovery for complex diseases. Network studies of drug-target relationships can indicate general trends among approved drugs and in the progress of drug discovery. A growing number of targeted therapies are approved or in the pipeline, and these therapies face a new set of problems with efficacy and adverse effects. The pitfalls of these mechanistically based drugs are described, along with how a systems view of drug action is increasingly important to uncover the intricate signaling mechanisms that play an important part in drug action, resistance mechanisms, and off-target effects. Computational methodologies enable the classification of drugs according to their structures and the proteins to which they bind. Recent studies have combined the structural analyses with analysis of regulatory networks to make predictions about the therapeutic effects of drugs for complex diseases and possible off-target effects.
Classical studies such as the development of receptor theory by Clark1 and Black,2,3 followed by analyses that distinguished between competitive and noncompetitive inhibition, served as the initial insight on the molecular mechanisms of drug action.4 The influence and relevance of receptor theory in modern pharmacology are due to the large number of drugs that target membrane receptors, mostly G-protein-coupled receptors (GPCRs). A majority of the remaining drugs are enzyme inhibitors, which are generally substrate-based analogs, transition-state analogs, or allosteric inhibitors, designed based on the structures of the substrate and the substrate-binding pocket to bind reversibly or irreversibly. Over the past decade, genomics, proteomics, network analyses, and other high-throughput studies have led to a wealth of “systems-level” information. Through combinations of the human genome studies, high-throughput technologies, and structural and biochemical studies, the number of well-characterized, druggable targets is constantly increasing.5,6 Hence, the drug pipeline has grown, with a focus on complex, multigenic diseases resulting in the appearance of targeted therapies and biological therapeutics, such as kinase inhibitors and monoclonal antibody therapies. Despite these advances, the overall rate of development of new drugs has greatly decreased.
There are more than 1500 drugs approved by the US Food and Drug Administration (FDA), but only approximately 400 unique proteins are targeted by these approved drugs (the “drugome”) (Figure 1). The diseaseome is defined as the current ~3700 unique entries in the Online Mendelian Inheritance in Man (OMIM) database and covers a tiny fraction of the proteome, which is estimated at approximately 100,000 proteins. The research drugome (the proteins targeted by approved drugs and/or drugs in research and clinical trials) overlaps with a portion (500 protein targets) of the diseaseome (Figure 1), and is focused in 15 to 20 therapeutic areas.7,8 This means that about two-thirds of current drugs do not target a gene product that has been associated with disease.
The drug-discovery process is hindered by the cost and time of the research and development efforts necessary to take a drug from clinical trial to market: $1 billion and from 5 to 10 years, on average. Furthermore, many drugs fail in phase III trials due to lack of efficacy (no significant difference between the placebo and drug treatment or between drug and current treatment alternative). Considering the extremely high cost of new-drug development and the high drug attrition rate, it is evident that new approaches are required to improve and expedite the process. Since most current drug-discovery efforts are targeted toward complex diseases, understanding the pathophysiology at a systems level could aid in the design and development of drugs with higher success rates in clinical trials.
The broadly defined field of systems biology has generated a large amount of accumulated knowledge on the biological processes involved in disease and drug action. In this review, we will provide an overview of how systems biology has been used to analyze the molecular mechanisms of human disease, current drug-target relationships, and mechanisms of targeted therapies, which is cumulatively termed “systems pharmacology.”
It is currently understood that single-gene disorders are rare and most genetic diseases are caused by the concerted action of many genetic abnormalities, along with other possible nongenetic influences such as environmental factors, hormone levels, age, weight, and diet. Consequently, the treatment of diseases with a single, “magic bullet” therapy has proven largely unsuccessful. The current understanding of pathophysiology is complicated by the large amount of information that has amassed over the years from biochemical and cell-biological studies, and a systems view is necessary to integrate all the factors that control the regulatory networks involved in orchestrating the physiological and pathophysiological responses.
In order to gain a systems-level, molecular perspective of disease, a network view of the relevant physiological function is essential.
Each physiological process has an underlying signaling network of chemicals, hormones, native ligands, protein receptors, ions, enzymes, transcription factors, DNA with associated epigenetic mechanisms, and RNA, both regulatory and protein encoding. Together these components control the level of various cellular components to modulate biochemical reactions, electrical signals, force generation, transcription, and protein translation. These cellular processes can occur with different kinetics, simultaneously and/or at different times, and at varying levels of magnitude. Therefore, each physiological function or phenotype has a very complex network of signals controlling it. To represent each physiological process in a parsable network, each protein, ion, or chemical becomes a “node,” and each interaction between 2 nodes (binding, chemical reaction, or other interaction) becomes an “edge” (Figure 2A, Table 1). Signaling networks can be represented as undirected graphs wherein interactions are specified but without hierarchy, or as directed graphs where the effect of 1 node on another is specified by the direction of the arrow. More specifically, directed graphs can have sign specification, such as positive for activation, negative for inhibition, or neutral interactions for scaffolds, which are represented differently by arrows, symbols, or colors. The hypothetical network in Figure 2A is directional, with activating interactions represented as arrows, inhibitory interactions as plungers, and neutral interactions as lines. In biological systems, signaling networks are not unidirectional9; instead, they are highly clustered. Certain nodes, termed “hubs” (Figure 2B), are highly connected,10 thus allowing many more components to be connected to other subnetworks. On the other hand, a system can contain “islands” (Figure 2B), sets of nodes not connected to the overall network.
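The node-and-edge representation described above can be made concrete with a small sketch. The following Python example is illustrative only: the node names, the edge list, and the degree cutoff used to call a node a hub are all hypothetical and do not correspond to Figure 2. It stores a signed, directed network as a list of edges, then reports highly connected nodes and any islands.

```python
from collections import defaultdict

# Hypothetical signed, directed edges: (source, target, sign).
# +1 = activation, -1 = inhibition, 0 = neutral (e.g. scaffold binding).
EDGES = [
    ("Ligand", "Receptor", +1),
    ("Receptor", "KinaseA", +1),
    ("Receptor", "KinaseB", +1),
    ("KinaseA", "TF", +1),
    ("KinaseB", "TF", +1),
    ("Phosphatase", "KinaseA", -1),
    ("Scaffold", "KinaseB", 0),
    ("IonX", "ChannelY", +1),  # not connected to the rest: an "island"
]

def degree(edges):
    """Count the edges touching each node (in-degree plus out-degree)."""
    deg = defaultdict(int)
    for src, dst, _sign in edges:
        deg[src] += 1
        deg[dst] += 1
    return dict(deg)

def components(edges):
    """Group nodes into connected components, ignoring edge direction."""
    neighbors = defaultdict(set)
    for src, dst, _sign in edges:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen, groups = set(), []
    for start in neighbors:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbors[node] - group)
        seen |= group
        groups.append(group)
    # Largest component first; the remaining ones are the islands.
    return sorted(groups, key=len, reverse=True)

deg = degree(EDGES)
hubs = sorted(n for n, d in deg.items() if d >= 3)  # cutoff of 3 is arbitrary
main_network, *islands = components(EDGES)
print("hubs:", hubs)
print("islands:", islands)
```

In a real analysis the edge list would come from curated interaction data, and a hub would be defined relative to the degree distribution rather than by a fixed cutoff.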
Several types of motifs exist within regulatory networks, such as negative feedback loops, positive feedback loops, and feed-forward loops (Figure 2B).9 These characteristics, as well as the high clustering coefficient, are attributed to the scale-free (a connectivity distribution that follows a power law11) and redundant properties of biological networks. These properties also allow for network perturbation without complete loss of function. Thus, a single imbalance in the regulatory network of a biological system, such as an increased protein level or biochemical reaction rate, does not necessarily lead to a disease state. Instead, within the regulatory network of a disease system, the signaling networks underlying the pathophysiology are likely to be persistently perturbed at more than 1 point (edge or node).
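As a minimal sketch of how feedback loops can be found and classified in a signed, directed network, the Python example below enumerates the simple cycles of a small graph and multiplies the edge signs around each cycle: an odd number of inhibitory edges gives a negative feedback loop, and an even number gives a positive one. The five-edge network and its node names are hypothetical, and this brute-force search is only practical for small networks.

```python
# Hypothetical signed edges: +1 = activation, -1 = inhibition.
SIGNED = {
    ("A", "B"): +1,
    ("B", "C"): +1,
    ("C", "A"): -1,  # C inhibits A, closing a negative feedback loop
    ("C", "D"): +1,
    ("D", "C"): +1,  # C and D activate each other: a positive feedback loop
}

def simple_cycles(signed):
    """Enumerate each simple directed cycle once, rooted at its smallest node."""
    succ = {}
    for src, dst in signed:
        succ.setdefault(src, []).append(dst)
    cycles = []

    def walk(root, node, path):
        for nxt in succ.get(node, []):
            if nxt == root:
                cycles.append(path[:])
            elif nxt > root and nxt not in path:
                # Only visit nodes larger than the root so that each
                # cycle is reported from a single starting point.
                walk(root, nxt, path + [nxt])

    for root in sorted(succ):
        walk(root, root, [root])
    return cycles

def loop_sign(cycle, signed):
    """Multiply the edge signs around a cycle."""
    sign = 1
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        sign *= signed[(a, b)]
    return sign

loops = {
    tuple(c): ("positive" if loop_sign(c, SIGNED) > 0 else "negative")
    for c in simple_cycles(SIGNED)
}
print(loops)
```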
The genomic irregularities associated with disease have been studied, through genome-wide association and positional cloning studies, which have become more efficient with the sequencing of the human genome. More recently, numerous genomic, short hairpin/small interfering RNA (shRNA/siRNA), proteomic, and phosphoproteomic studies have focused on the gene-expression signatures and protein-signaling pathways associated with a particular disease (Table 2).12–17 The plethora of data on the genetic mutations and variations produced by studies such as these has been compiled into several databases, such as the OMIM database and the Catalogue of Somatic Mutations in Cancer (COSMIC) database. These global databases help to identify genes, proteins, and other cellular components related to the origin or progression of a specific disease. These components can be used to generate lists of “seed nodes” that enable us to computationally construct various types of networks by integrating knowledge about gene-disease associations and/or protein interactions reported in biomedical literature. Through information about the properties of disease genes as nodes in a network, previously unrelated relationships between disease pathways and cellular functions can be identified, which can serve as a first step in identifying drug targets.18–20
An interactive, computable database, the Connectivity Map,21 finds correlations between gene-expression signatures and the sets of proteins involved with the action of a class of drugs or with a particular disease. The initial Connectivity Map project analyzed the genome-wide messenger RNA (mRNA) expression profiles of 1 cell type after treatment with 164 small molecules and perturbagens (a subset was tested in 3 other cell lines). Instead of using traditional hierarchical clustering for analyzing the genetic data, the authors developed and adopted a method termed “gene set enrichment analysis,” a nonparametric, rank-based pattern-matching strategy,21 with the idea that regardless of the cell type, the gene-expression data specific to perturbation by a small molecule could be detected. The gene-expression signatures could be queried to search for commonalities among expression signatures for seemingly unrelated drugs or drug classes. This tool aids in the empirical understanding of current drug action and the discovery and characterization of new drug entities.22
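To illustrate what a nonparametric, rank-based pattern-matching score looks like, the sketch below computes a simplified, unweighted Kolmogorov-Smirnov-style running sum. It is not the scoring procedure of the Connectivity Map itself; the function name, the gene identifiers, and the equal step weights are assumptions made for illustration. Genes are ranked by their response to a perturbation, and the score measures whether a query gene set concentrates at the top (score near +1) or bottom (score near -1) of that ranking.

```python
def enrichment_score(ranked_genes, gene_set):
    """Maximum deviation from zero of a running sum over a ranked gene list.

    The sum steps up at each gene in `gene_set` and down at every other
    gene, with step sizes chosen so the sum ends at zero.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene_set must overlap the ranking only partially")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

# Hypothetical ranking: g1 is most up-regulated, g10 most down-regulated.
ranked = [f"g{i}" for i in range(1, 11)]
top_score = enrichment_score(ranked, {"g1", "g2", "g3"})
bottom_score = enrichment_score(ranked, {"g8", "g9", "g10"})
print(round(top_score, 3), round(bottom_score, 3))
```

A signature that scores strongly positive against one drug profile and strongly negative against another would suggest opposing effects on the same set of genes; assessing significance would additionally require a permutation-based null distribution, which is omitted here.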
In another example, Goh et al.19 examined gene-disorder association data in 2 networks: (1) the human-disease network, which treats human diseases as nodes, connected based on the associated genes they have in common, and (2) the disease-gene network, which connects disease genes according to whether they are associated with the same disorder. The data were used to construct a bipartite interaction network using 1284 known disorders and 1177 known disease genes from the OMIM database. The diseases were further broken down into 22 classes. Both networks were highly connected and had a “giant component,” a hub-like area where many nodes were connected to one another. In the human disease network, cancer and related disorders and neurological disorders were in highly connected, heterogeneous hubs, whereas metabolic, skeletal, and other disorders were in less-connected “islands.” The distribution of gene associations was wide: most diseases were associated with a few genes, whereas some disorders were associated with a much higher number of genes (30 to 50). There were numerous islands, more notably in the disease-gene network. It was previously demonstrated that disease-related genes are hubs or have a higher number of interactions than nondisease proteins.
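The two networks described above are projections of one bipartite disorder-gene graph. The following sketch shows the projection step on a handful of invented associations; the disorder and gene names are placeholders, not OMIM entries, and the edge weight (number of shared partners) is one reasonable choice rather than the weighting used in the cited study.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical disorder-gene associations (a bipartite edge list).
ASSOCIATIONS = [
    ("DisorderA", "GENE1"), ("DisorderA", "GENE2"),
    ("DisorderB", "GENE2"), ("DisorderB", "GENE3"),
    ("DisorderC", "GENE3"),
    ("DisorderD", "GENE4"),  # shares nothing: isolated in both projections
]

def project(pairs, on):
    """Project the bipartite graph onto disorders or onto genes.

    Two disorders are linked if they share a gene; two genes are linked
    if they share a disorder. The weight counts the shared partners.
    """
    groups = defaultdict(set)
    for disorder, gene in pairs:
        if on == "disorder":
            groups[gene].add(disorder)
        else:
            groups[disorder].add(gene)
    edges = defaultdict(int)
    for members in groups.values():
        for a, b in combinations(sorted(members), 2):
            edges[(a, b)] += 1
    return dict(edges)

human_disease_network = project(ASSOCIATIONS, on="disorder")
disease_gene_network = project(ASSOCIATIONS, on="gene")
print(human_disease_network)
print(disease_gene_network)
```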
The majority of the gene list used in the human disease and disease gene networks contained nonessential genes (“essential” defined as being embryonic/postnatal lethal).19 It was observed that with essential genes excluded from the disease-gene network, the connectivity greatly diminishes, indicating that the higher connectivity was observed solely due to the high connectivity inherent to the essential genes. Furthermore, the expression patterns of disease genes were asynchronous with genes essential for cellular functions, and the disease genes had little overlap with housekeeping genes. This indicated that disease genes are often peripheral, with 1 important exception: somatic mutations found in cancer, which did not fit the analysis described above.
In another analysis of the properties of disease genes from a systems perspective, Zhong et al.20 showed that genetic mutations that perturb networks at an edge (edgetic: mutated nearly full-length or point-mutated full-length gene products) rather than at a node (null: truncated gene products) are more heavily responsible for the phenotypic manifestation of the diseases.20 Edgetic perturbations are more commonly found in mutations that span multiple diseases and are caused by in-frame mutations and the expression of a nearly full-length gene product, which lacks an interaction with a neighboring node (Figure 2C). In contrast, node removal or a null mutation includes severely truncated gene products or other mutations that destabilize protein structure and equate to the removal of the protein with respect to its function and interactions (Figure 2C). Edgetic perturbations have potential consequences at more than 1 node in the network. Furthermore, the authors cited examples where different mutations of the same gene product affect different edges, and each edgetic mutation is associated with a different disorder.
Systematically organizing relationships between diseases, drugs, and proteins into a map such as a signaling network enables statistical inferences about less-obvious relationships between them. Global analyses of FDA-approved drugs have found that, unlike ubiquitously expressed essential genes, drug targets tend to be expressed in specific tissues but interact with many other proteins in the cellular network while being independently regulated.8 Commonalities of signaling mechanisms among existing drug targets create a basis for identifying the network properties that define a potentially good drug target. Statistical metrics from network analyses, such as a centrality measure, which quantifies the relative importance of a protein in communicating between different modules within a network, have been suggested for identifying nodes (proteins in a network) that have attractive properties as potential drug targets.23
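One common centrality measure that captures communication between modules is betweenness centrality, the number of shortest paths that pass through a node. Whether this is the specific metric used in the cited work is not stated here, so the choice is an assumption. The sketch below implements Brandes' algorithm for a small unweighted, undirected graph in which a hypothetical bridge node X links two three-node modules.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    `adj` maps each node to the set of its neighbors.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                     # accumulate from the farthest node back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}

# Two hypothetical modules (A-B-C and D-E-F triangles) bridged by X.
ADJ = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "X"},
    "X": {"C", "D"},
    "D": {"X", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
bc = betweenness(ADJ)
most_central = max(bc, key=bc.get)
print(most_central, bc)
```

Here X has only 2 neighbors yet the highest betweenness, which illustrates why such a measure can flag nodes that a simple degree count would miss.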
Ma’ayan et al. constructed a bipartite graph of 1052 FDA-approved drugs interacting with 485 targets, using data from DrugBank; the graph contained 179 islands.24,25 Most of these islands are made up of 10 to 30 interacting cellular components. Drugs that target GPCRs comprise a single large island of 481 components.26 GPCRs are the largest class of targets of currently approved drugs (~25%),5 and they govern a diversity of physiological processes such as cardiac contractility, acid secretion, and airway constriction.
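Counting the islands of a drug-target graph amounts to counting its connected components. A minimal sketch using a union-find structure is given below; the drug and target names are invented placeholders rather than DrugBank entries, and the function name is hypothetical.

```python
def island_sizes(pairs):
    """Return the sizes of the connected components of a bipartite
    drug-target graph, largest first, using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for drug, target in pairs:
        # Tag each name with its type so a drug and a target that
        # happen to share a name are never merged by accident.
        root_d = find(("drug", drug))
        root_t = find(("target", target))
        parent[root_d] = root_t

    sizes = {}
    for node in list(parent):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted(sizes.values(), reverse=True)

# Hypothetical drug-target pairs.
PAIRS = [
    ("drugA", "T1"), ("drugB", "T1"), ("drugB", "T2"),
    ("drugC", "T3"),
    ("drugD", "T4"), ("drugE", "T4"),
]
sizes = island_sizes(PAIRS)
print(len(sizes), "islands with sizes", sizes)
```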
In a study by Yildirim et al.,8 a bipartite graph of drugs and their targets listed in DrugBank was generated. Analysis of this drug-target network showed that the majority (62%) of currently approved drugs target membrane proteins, the bulk of which are most likely GPCRs, as indicated by the results of the study by Ma’ayan et al.26 In the analysis of the targets of research drugs together with current drugs, the distribution of target identities became more diverse, with the share of cytoplasmic targets doubling (6% to 12%). This result indicates that drug-discovery technologies have improved enough to target nonmembrane proteins, and it reflects the emergence of targeted therapies.
Targeted therapies or signal transduction inhibitors make up the bulk of the current drug pipeline. Of all drug targets, GPCRs make up the largest share (25%), followed by kinases (10%).5 The shift from GPCR drugs to other receptor classes and protein kinase–targeted drugs (>100 kinase inhibitors are in development and 15 are currently approved as cancer therapies, Table 3) is representative of the cumulative knowledge of the mechanisms underlying disease. The obvious advantage of targeted therapies is their mechanistic derivation, targeted to the specific signaling molecule (typically a kinase) that exhibits aberrant behavior in disease. One disadvantage is that many targeted therapies address a new area, and the risk of failure is higher if the drug has a novel mechanism of action.27 Other disadvantages lie in the difficulty of achieving high specificity and the resulting off-target effects and primary and acquired mechanisms of resistance to these therapies. Complex signaling mechanisms and motifs often play a role in resistance, underscoring the importance of a systems-pharmacology understanding in the target choice and drug design.
The bulk of targeted therapies is used to treat cancer, which this discussion will focus on. The first example of a targeted therapy in cancer was tamoxifen, which specifically inhibits the estrogen receptor in estrogen receptor–positive breast cancers.28 The first successful kinase inhibitor was the BCR-ABL inhibitor imatinib, which is approved for treating chronic myelogenous leukemia. Other examples (Table 2) include human epidermal growth factor receptor family (EGFR and HER-2/ErbB2) inhibitors such as monoclonal antibody therapies trastuzumab, cetuximab, and panitumumab, and the tyrosine kinase inhibitors erlotinib, gefitinib, and lapatinib.
By design, these therapies are administered to select patient populations that exhibit a specific mutation or heightened expression of the protein target. For example, prior to receiving trastuzumab, patients must test positive for elevated ErbB2 expression29,30; and to be enrolled in a cetuximab combination clinical trial for pancreatic cancer, biopsies must show heightened EGFR expression.31 On the other hand, patients are also tested for primary mechanisms of resistance to these drugs, such as K-Ras mutations mitigating the response to EGFR inhibitors. Because these drugs target a specific signaling molecule, acquired resistance is a significant problem. For example, the T790M mutation in EGFR increases the affinity of adenosine triphosphate (ATP) for the EGFR active site but ablates the affinity of the EGFR kinase inhibitor gefitinib. Also, various mutations in mitogen-activated/extracellular signal-regulated kinase kinase (MEK) were exhibited by tumors that relapsed following treatment with the MEK inhibitor AZD6244, which is currently in clinical trials.32
Other types of acquired resistance are caused by compensatory mechanisms. These are distinct from the resistance encountered with nonselective chemotherapies, which results from increased levels of drug transporters such as the ATP-binding cassette protein family. Instead, the resistance mechanisms involve nodes within the signaling network of the targeted protein. One example of resistance to EGFR therapy is the maintenance of phosphoinositide 3-kinase (PI3K) signaling through a variety of mechanisms, including loss of phosphatase and tensin homolog (PTEN) (a phosphatase and negative regulator of V-akt murine thymoma viral oncogene homolog [AKT]), ErbB3, or MET (hepatocyte growth factor receptor) amplification, which allows for tumor cell survival despite EGFR inhibition.33 Additionally, in gefitinib-resistant and cetuximab-resistant cells, the HER signaling switches from EGFR-containing heterodimers and homodimers to primarily HER2-HER3 signaling dimers, which more potently activate the PI3K/AKT pathway.32,34
In targeting a specific component in a complex signaling pathway, there is potential to disrupt all of the intrinsic signaling mechanisms associated with that particular node, which can have counterintuitive results.
For example, due to prevalence of mutations upstream of extracellular signal-regulated kinase (ERK) in cancers, such as mutant Ras and B-Raf, several B-Raf and MEK inhibitors are currently approved or in development. Due to the complexity of the signaling pathways, targeted inhibition of these molecules in cancer cells can cause unwanted signal amplification or bypass the inhibition within the targeted network. For instance, overactive B-Raf will hyperactivate the MEK–mitogen-activated protein kinase (MEK-MAPK) pathway, similar to oncogenic Ras, and inhibition of MEK should normalize this aberrancy, whether it originates with B-Raf or Ras mutations (Figure 3). However, MEK inhibition caused an increase in PI3K/AKT activation, and, vice versa, PI3K/AKT inhibition caused an increase in MAPK activation (Figure 3B).35,36 These paradoxical results are caused by a negative feedback loop in receptor tyrosine kinase (RTK) signaling that is mediated by TSC2-P70S6K and controlled by AKT and ERK. Hence, when MEK/ERK or PI3K/AKT is inhibited, a negative feedback loop is partially shut down, causing signal amplification in the noninhibited arm of the pathway (Figure 3B).
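The paradoxical cross-activation described above can be reproduced qualitatively with a deliberately crude toy model. Everything in the sketch below is an assumption made for illustration: the steady-state algebra, the parameter values, and the lumping of the TSC2-P70S6K feedback into a single damping term are not taken from the cited studies. An RTK drives an ERK arm and an AKT arm, and both arms feed back to damp RTK activity.

```python
import math

def steady_state(mek_inhibition=0.0, pi3k_inhibition=0.0,
                 stimulus=1.0, feedback=1.0):
    """Toy steady state of an RTK damped by its two downstream arms.

    Assumed algebra (hypothetical, for illustration only):
        RTK = stimulus / (1 + feedback * (ERK + AKT))
        ERK = (1 - mek_inhibition)  * RTK
        AKT = (1 - pi3k_inhibition) * RTK
    Substituting gives k*RTK^2 + RTK - stimulus = 0 with
    k = feedback * ((1 - mek_inhibition) + (1 - pi3k_inhibition)).
    """
    a = 1.0 - mek_inhibition
    b = 1.0 - pi3k_inhibition
    k = feedback * (a + b)
    if k == 0:
        rtk = stimulus  # no feedback left: RTK follows the stimulus
    else:
        rtk = (-1.0 + math.sqrt(1.0 + 4.0 * k * stimulus)) / (2.0 * k)
    return {"RTK": rtk, "ERK": a * rtk, "AKT": b * rtk}

baseline = steady_state()
mek_blocked = steady_state(mek_inhibition=0.9)
pi3k_blocked = steady_state(pi3k_inhibition=0.9)
print("baseline    ", baseline)
print("MEK blocked ", mek_blocked)
print("PI3K blocked", pi3k_blocked)
```

In this toy system, blocking MEK lowers ERK output but raises AKT output, and blocking PI3K does the reverse, because removing one arm weakens the shared negative feedback on the receptor. A quantitative model would require measured kinetics and explicit time dependence.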
In another example, treatment with the MEK inhibitor AZD6244 reduced proliferation of cells that harbor the activating B-Raf V600E mutation.37 However, in cells harboring activating K-Ras mutations with wild-type B-Raf, AZD6244 treatment caused accumulation of phospho-MEK, partially due to ablation of a negative feedback loop between MEK and B-Raf, mediated by Sprouty2 (Figure 3B).38 This negative feedback loop is disrupted by the B-Raf V600E mutation, due to reduced affinity between mutant B-Raf and Sprouty2. Hence, response to AZD6244 is observed in cells harboring the V600E mutation.
Two recent, pertinent studies describe how B-Raf inhibition drives Ras-mediated B-Raf binding to and activation of C-Raf, followed by MEK-ERK activation (Figure 3C).39,40 In some B-Raf/Ras wild type (WT) cell lines and B-Raf WT/Ras-mutant cell lines, Raf-inhibitor treatment caused an increase in tumor cell proliferation. Binding of B-Raf inhibitors to the B-Raf ATP pocket potentiated the formation of B-Raf/C-Raf heterodimers at the membrane, in a Ras-dependent manner. Increased heterodimer formation correlated with increased C-Raf activity and increased MEK activation. MEK activation did not occur with drugs that inhibit both B-Raf and C-Raf.39 Another study on AKT inhibitors produced similar results. When AKT was specifically inhibited, a pool of hyperphosphorylated AKT formed, due to the “priming” caused by inhibitor binding to the ATP pocket of AKT.41
These scenarios exemplify that network-signaling motifs play a role in drug action, highlighting the important role that the intricacies of the signaling pathways play in governing the effects of these targeted therapies.
A challenge in applying systems-pharmacology methods in translational medicine is the identification of complex signaling mechanisms early in the drug-discovery process. A first and necessary step to address this challenge in the clinic is obtaining as much data as possible on pertinent mutations and genomic variations, especially with complex diseases in which variable and multiple genetic factors lead to a diseased state. While a set of parameters does not exist to classify the gene products or interactions that will play a critical role in disease or drug action, establishing a systems perspective can lead to the detection of unforeseen effects of modulating a single component in a biological signaling network. In the early stages of considering a drug target, the following steps can be taken to establish a systems perspective and allow the possibility of uncovering unforeseen consequences of modulating the activity of the drug target: (1) Using the literature and databases, identify all components that interact with the drug target of interest and the signaling pathways/networks with which these components are involved. (2) Construct a directed graph of all of these components (Figure 2), including all direct interactions and ones that lead to physiologically significant events (ie, ones that lead to a major result such as transcription, cell survival/death, motility, or others). (3) Identify redundant pathways and signaling motifs such as feedback loops, which may cause signal amplification, compensation, or other effects in the specific system of interest. (4) Establish a cell-based model to test these scenarios and take into account the results when considering the drug target and combining or prescribing drugs in a given genetic background.
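Steps 2 and 3 above can be sketched computationally. The Python example below builds a small directed graph and lists the routes from a receptor to a physiological outcome that remain open when a candidate drug target is blocked, which is one simple way to flag redundant pathways. The wiring is a heavily simplified, hypothetical abstraction of RTK signaling, and the function name is invented for this illustration.

```python
def routes(graph, start, goal, blocked=frozenset(), _path=None):
    """List every simple directed path from start to goal that avoids
    the nodes in `blocked` (the hypothetically inhibited targets)."""
    path = (_path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in graph.get(start, []):
        if nxt not in path and nxt not in blocked:
            found.extend(routes(graph, nxt, goal, blocked, path))
    return found

# Hypothetical, simplified directed graph (step 2).
GRAPH = {
    "RTK": ["Ras", "PI3K"],
    "Ras": ["Raf"],
    "Raf": ["MEK"],
    "MEK": ["ERK"],
    "ERK": ["Survival"],
    "PI3K": ["AKT"],
    "AKT": ["Survival"],
}

# Step 3: which routes to the outcome survive inhibition of MEK?
all_routes = routes(GRAPH, "RTK", "Survival")
bypass = routes(GRAPH, "RTK", "Survival", blocked={"MEK"})
print(len(all_routes), "routes in total")
print("routes that bypass MEK:", bypass)
```

A nonempty bypass list suggests that inhibiting the candidate target alone may be compensated by a parallel pathway, which is the kind of hypothesis that step 4 would then test in a cell-based model.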
Often, drugs bind to several targets that are seemingly unrelated except for the target-binding pattern of the drug. Recent work has generated a network connecting drugs on the basis of their structural similarity and similarity of side-effect profiles.42 This method has been proven effective at identifying groups of drugs that share common targets. When drugs with predicted shared targets did not have known targets in common, the predictions provided a framework for testing these drugs for binding against the predicted targets. Hence, this approach led to the identification of new targets for existing drugs, which serve as new therapeutic targets or explain the cause(s) of an adverse effect.
There have been recent developments in “chemo-centric” methodologies to systematically compare the structures of drugs with known targets, as a means of predicting unknown target binding. The idea of searching a database of structures for a similar structure has been around a long time in the field of chemistry. For decades, chemists have been using structure-based literature searches for reaction methods, mechanisms, chemical information, and various analyses.43 New approaches allow databases of structural and pharmacological data to be parsed using chemical structure, provide information on the global properties of compound libraries with respect to their biological properties, and predict binding interactions.
A major feat in establishing this methodology was the assembly of a large database of chemical entities, of which ~275,000 are curated with structure-activity relationship data (target identity, IC50, EC50, Ki, and KD values) from >2800 targets.44 The human subset of targets (700) and the respective drugs they bind was used to create a map of the protein targets (nodes) and how they are connected in chemical space (1 molecule binding 2 proteins creates an edge). The promiscuity of each target was analyzed, with aminergic GPCRs, cytochrome P450, and protein kinases ranking the highest. In a similar analysis, it was found that compounds generally targeted proteins from the same gene families. The method was also used to predict pharmacology of compounds and analyze how the physical properties of drugs differ within each target family. Unfortunately, the database is not publicly available and is proprietary to Pfizer, where the study took place,44 but it represents a cumulative search tool, a novel method of searching for new leads in the drug-discovery process and improving the physical properties of lead compounds.
Another approach, coined the Similarity Ensemble Approach (SEA), classifies ligands, with known receptor interactions, by their chemical similarity using the Tanimoto coefficient of chemical similarity (Tc), a fragment-based algorithm.45 Unrelated structures are parsed with these codes to predict with which protein targets they might interact. Comparing drug-target interactions by way of protein structure comparisons is much more difficult than by way of drug structure comparisons, due to the sheer difference in size and structural complexity. Central to this methodology is the idea that small molecules with similar structures will display similar target-binding properties. A recent study used this method and predicted that Paxil and Prozac (serotonin reuptake inhibitors) are also β-blockers.23 The binding of these drugs to the β-adrenergic receptor was experimentally verified. The authors speculated that some observed adverse effects of the drugs, such as increased heart rate and sexual dysfunction, could be attributed to the off-target binding.
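The Tanimoto coefficient itself is simple to state: for 2 molecules described by sets of structural fragments, Tc is the number of shared fragments divided by the number of distinct fragments in either molecule. The sketch below computes Tc and then ranks targets for a query molecule by summing its Tc values against each target's known ligands above a cutoff. The fragment labels, target names, and the 0.5 cutoff are hypothetical, and the ranking step is only loosely in the spirit of a set-based similarity score; it omits the statistical correction against random similarity that a full method would need.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fragment sets: |A & B| / |A | B|."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    # Convention chosen here: two empty fingerprints have similarity 0.
    return len(a & b) / len(union) if union else 0.0

# Hypothetical targets, each with fragment sets for its known ligands.
KNOWN_LIGANDS = {
    "TargetX": [{"aryl", "ether", "amine"},
                {"aryl", "ether", "amine", "halide"}],
    "TargetY": [{"amide", "thiol"},
                {"aryl", "amide"}],
}

def raw_set_score(query, ligands, threshold=0.5):
    """Sum the query's Tc against each ligand, ignoring weak matches."""
    similarities = (tanimoto(query, ligand) for ligand in ligands)
    return sum(tc for tc in similarities if tc >= threshold)

query = {"aryl", "ether", "amine", "CF3"}
scores = {target: raw_set_score(query, ligands)
          for target, ligands in KNOWN_LIGANDS.items()}
best_target = max(scores, key=scores.get)
print(scores, "->", best_target)
```

In practice the fragment sets would be replaced by computed chemical fingerprints, and a predicted interaction would still require experimental confirmation of binding.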
Basic cell biological research continues to delineate the functions of each protein in the proteome as well as metabolites in the metabolome to construct the “interactome” (how all cellular components interact) in both physiological and pathophysiological systems. The construction of the interactomes for specific pathophysiology can be quite useful for uncovering potential new drug targets. The combination of such physiological analyses at a systems level with informatics approaches based on drug structure and structure-function relationships offers a powerful new approach for drug discovery in the preclinical phase. The drug-discovery platforms have shifted toward targeted therapies, and drug-design technologies have greatly advanced to include biological therapies (antibodies and peptides), antisense therapies, and other methods to modulate protein-protein interactions, expression levels, or other interactions. However, more potential drug targets and drug-discovery technologies also mean more possible efficacy failures and adverse effects that will set back the efforts to move drugs to the clinic. To enhance the rate at which new drugs are brought into the clinic, one will need to understand the pharmacokinetics and pharmacodynamics of new drugs and use this knowledge in clinical-trial design. A systems-biology perspective and advanced computational methodologies provide an integrated basis for understanding the complex mechanisms of disease and of targeted therapy action and for designing combination therapies, and thus influence clinical-trial design to rapidly test new drugs in the clinic.
We thank Seth Berger for help with the drug-target analyses associated with Figure 1 and Sherry Jenkins for valuable feedback. This research is supported by the Systems Biology Center Grant P50-GM 071 558.
Potential conflict of interest: Nothing to report.