It is currently understood that single-gene disorders are rare and most genetic diseases are caused by the concerted action of many genetic abnormalities, along with other possible nongenetic influences such as environmental factors, hormone levels, age, weight, and diet. Consequently, the treatment of diseases with a single, “magic bullet” therapy has proven largely unsuccessful. The current understanding of pathophysiology is complicated by the large amount of information that has amassed over the years from biochemical and cell-biological studies, and a systems view is necessary to integrate all the factors that control the regulatory networks involved in orchestrating the physiological and pathophysiological responses.
In order to gain a systems-level, molecular perspective of disease, a network view of the relevant physiological function is essential.
Each physiological process has an underlying signaling network of chemicals, hormones, native ligands, protein receptors, ions, enzymes, transcription factors, DNA with associated epigenetic mechanisms, and RNA, both regulatory and protein encoding. Together these components control the levels of various cellular components to modulate biochemical reactions, electrical signals, force generation, transcription, and protein translation. These cellular processes can occur with different kinetics, simultaneously and/or at different times, and at varying levels of magnitude. Therefore, each physiological function or phenotype is controlled by a very complex network of signals. To represent each physiological process in a parsable network, each protein, ion, or chemical becomes a “node,” and each interaction between 2 nodes (binding, chemical reaction, or other interaction) becomes an “edge.” Signaling networks can be represented as undirected graphs, in which interactions are specified but without hierarchy, or as directed graphs, in which the effect of 1 node on another is specified by the direction of the arrow. Directed graphs can also carry a sign specification, such as positive for activation, negative for inhibition, or neutral for interactions such as scaffolding; these are distinguished by different arrows, symbols, or colors. The hypothetical network in the accompanying figure is directional, with activating interactions represented as arrows, inhibitory interactions as plungers, and neutral interactions as lines. In biological systems, signaling networks are not unidirectional [9]; instead, they are highly clustered. Certain nodes, termed “hubs,” are highly connected [10], thus allowing many more components to be connected to other subnetworks. On the other hand, a system can contain “islands,” sets of nodes not connected to the overall network. Several types of motifs exist within regulatory networks, such as negative feedback loops, positive feedback loops, and feed-forward loops [9]. These characteristics, as well as the high clustering coefficient, are attributed to the scale-free (a connectivity distribution that follows a power law [11]) and redundant properties of biological networks. These properties also allow for network perturbation without complete loss of function. Thus, a single imbalance in the regulatory network of a biological system, such as an increased protein level or biochemical reaction rate, does not necessarily lead to a disease state. Instead, in a disease state, the signaling networks underlying the pathophysiology are likely to be persistently perturbed at more than 1 point (edge or node).
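The graph vocabulary above (nodes, signed directed edges, hubs, islands, feedback loops) can be made concrete with a small sketch. The network below is entirely hypothetical: the node names, the edges, and the helper functions `neighbors`, `hubs`, `components`, and `loop_sign` are illustrative assumptions, not taken from any cited study or library.

```python
from collections import defaultdict

# Hypothetical toy network; every node name is illustrative only.
# Each edge is (source, target, sign): +1 activation, -1 inhibition, 0 neutral.
EDGES = [
    ("Ligand", "Receptor", +1),
    ("Receptor", "KinaseA", +1),
    ("KinaseA", "KinaseB", +1),
    ("KinaseB", "TF", +1),
    ("KinaseA", "TF", +1),
    ("TF", "Phosphatase", +1),
    ("Phosphatase", "KinaseA", -1),  # closes a feedback loop
    ("KinaseA", "Scaffold", 0),      # neutral (scaffolding) interaction
    ("X", "Y", +1),                  # an "island" detached from the rest
]


def neighbors(edges):
    """Undirected adjacency map that ignores edge direction and sign."""
    nbrs = defaultdict(set)
    for source, target, _sign in edges:
        nbrs[source].add(target)
        nbrs[target].add(source)
    return dict(nbrs)


def hubs(edges, top=1):
    """Return the `top` most highly connected nodes (ties broken by name)."""
    nbrs = neighbors(edges)
    return sorted(nbrs, key=lambda n: (-len(nbrs[n]), n))[:top]


def components(edges):
    """Connected components, largest first; small ones are the islands."""
    nbrs = neighbors(edges)
    seen, comps = set(), []
    for start in nbrs:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(nbrs[node] - comp)
        seen |= comp
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)


def loop_sign(edges, cycle):
    """Multiply edge signs around a directed cycle given as a node list.

    A product of +1 indicates a positive feedback loop, -1 a negative one;
    a neutral (0) edge in the cycle yields 0.
    """
    sign = {(s, t): sg for s, t, sg in edges}
    product = 1
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        product *= sign[(a, b)]
    return product


print(hubs(EDGES))
print(components(EDGES)[1:])
print(loop_sign(EDGES, ["KinaseA", "KinaseB", "TF", "Phosphatase"]))
```

In this toy example the most connected node plays the role of a hub, the detached pair forms an island, and the single inhibitory edge makes the cycle a negative feedback loop. Real signaling networks are far larger, and a dedicated graph library would normally replace these hand-written helpers.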
Table 1. Definitions of Key Terms in Systems Pharmacology.
The genomic irregularities associated with disease have been studied through genome-wide association and positional cloning studies, which have become more efficient since the sequencing of the human genome. More recently, numerous genomic, short hairpin/small interfering RNA (shRNA/siRNA), proteomic, and phosphoproteomic studies have focused on the gene-expression signatures and protein-signaling pathways associated with a particular disease [12–17]. The wealth of data on genetic mutations and variations produced by such studies has been compiled into several databases, such as the OMIM database and the Catalogue of Somatic Mutations in Cancer (COSMIC) database. These global databases help to identify genes, proteins, and other cellular components related to the origin or progression of a specific disease. These components can be used to generate lists of “seed nodes” from which various types of networks can be constructed computationally by integrating knowledge about gene-disease associations and/or protein interactions reported in the biomedical literature. By examining the properties of disease genes as nodes in a network, previously unrecognized relationships between disease pathways and cellular functions can be identified, which can serve as a first step in identifying drug targets [18–20].
Table 2. Systems Biology Studies That Elucidate Mechanisms of Pathophysiology and Drug Action.
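The seed-node approach described above can be sketched as a bounded expansion over a list of known interactions. This is a minimal illustration under stated assumptions: the interaction pairs, the seed list, and the function `expand_seeds` are hypothetical, and the sketch does not reproduce the query interface of OMIM, COSMIC, or any other database.

```python
from collections import defaultdict

# Hypothetical literature-derived interactions (undirected pairs).
INTERACTIONS = [
    ("G1", "P1"),
    ("P1", "G2"),
    ("G2", "P2"),
    ("P2", "P3"),
    ("P4", "P5"),
]

# Hypothetical disease-associated genes used as seed nodes.
SEEDS = {"G1", "G2"}


def expand_seeds(interactions, seeds, max_steps=1):
    """Grow a subnetwork outward from the seeds by at most `max_steps` hops.

    Returns (nodes, edges), where edges are the interactions whose two
    endpoints both fall inside the expanded node set.
    """
    nbrs = defaultdict(set)
    for a, b in interactions:
        nbrs[a].add(b)
        nbrs[b].add(a)

    included = set(seeds)
    frontier = set(seeds)
    for _ in range(max_steps):
        reached = set()
        for node in frontier:
            reached |= nbrs.get(node, set()) - included
        included |= reached
        frontier = reached

    edges = [(a, b) for a, b in interactions if a in included and b in included]
    return included, edges


nodes, edges = expand_seeds(INTERACTIONS, SEEDS, max_steps=1)
print(sorted(nodes))
print(edges)
```

With a single expansion step, the intermediate node that links the two seeds is pulled into the subnetwork, which is the kind of previously unrecognized connection such an analysis is meant to surface. The choice of step limit is a design decision: larger values recover more context but quickly include much of the interactome.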
An interactive, computable database, the Connectivity Map [21], finds correlations between gene-expression signatures and the sets of proteins involved with the action of a class of drugs or with a particular disease. The initial Connectivity Map project analyzed the genome-wide messenger RNA (mRNA) expression profiles of 1 cell type after treatment with 164 small molecules and perturbagens (a subset was tested in 3 other cell lines). Instead of using traditional hierarchical clustering to analyze the genetic data, the authors developed and adopted a method termed “gene set enrichment analysis,” a nonparametric, rank-based pattern-matching strategy [21], with the idea that the gene-expression changes specific to perturbation by a small molecule could be detected regardless of cell type. The gene-expression signatures can be queried to search for commonalities among the expression signatures of seemingly unrelated drugs or drug classes. This tool aids in the empirical understanding of current drug action and in the discovery and characterization of new drug entities [22].
In another example, Goh et al. [19] examined gene-disorder association data in 2 networks: (1) the human-disease network, which treats human diseases as nodes, connected based on the associated genes they have in common, and (2) the disease-gene network, which connects disease genes according to whether they are associated with the same disorder. The data were used to construct a bipartite interaction network using 1284 known disorders and 1177 known disease genes from the OMIM database. The diseases were further broken down into 22 classes. Both networks were highly connected and had a “giant component,” a hub-like area where many nodes were connected to one another. In the human-disease network, cancer and related disorders and neurological disorders were in highly connected, heterogeneous hubs, whereas metabolic, skeletal, and other disorders were in less-connected “islands.” The distribution of gene associations was wide: most diseases were associated with a few genes, whereas some disorders were associated with a much higher number of genes (30 to 50). There were numerous islands, most notably in the disease-gene network. It was previously demonstrated that disease-related genes are hubs or have a higher number of interactions than nondisease proteins.
The majority of the genes used in the human-disease and disease-gene networks were nonessential (“essential” defined as being embryonic/postnatal lethal) [19]. When essential genes were excluded from the disease-gene network, connectivity diminished greatly, indicating that the higher connectivity observed was due solely to the high connectivity inherent to the essential genes. Furthermore, the expression patterns of disease genes were asynchronous with those of genes essential for cellular functions, and the disease genes had little overlap with housekeeping genes. This indicated that disease genes are often peripheral, with 1 important exception: somatic mutations found in cancer, which did not fit the analysis described above.
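The two network views and the essential-gene exclusion described above can be sketched from a single table of disorder-gene associations. Everything in this example is a hypothetical assumption made for illustration: the disorder and gene names, the `ESSENTIAL` set, and the function names. The toy data are deliberately arranged so that the essential gene is the best-connected one, which shows the mechanics of the comparison and does not reproduce the published result.

```python
from itertools import combinations

# Hypothetical disorder -> associated genes table (one side of a bipartite graph).
ASSOCIATIONS = {
    "DisorderA": {"g1", "g2"},
    "DisorderB": {"g2", "g3"},
    "DisorderC": {"g3", "g4", "g5"},
    "DisorderD": {"g6"},
}

# Hypothetical set of essential genes.
ESSENTIAL = {"g3"}


def human_disease_network(assoc):
    """Link two disorders when they share at least one associated gene."""
    return {
        frozenset((d1, d2))
        for d1, d2 in combinations(sorted(assoc), 2)
        if assoc[d1] & assoc[d2]
    }


def disease_gene_network(assoc, exclude=frozenset()):
    """Link two genes when they are associated with the same disorder.

    Genes listed in `exclude` (for example, essential genes) are dropped
    before the links are formed.
    """
    edges = set()
    for genes in assoc.values():
        kept = sorted(genes - set(exclude))
        edges.update(frozenset(pair) for pair in combinations(kept, 2))
    return edges


hdn = human_disease_network(ASSOCIATIONS)
dgn_all = disease_gene_network(ASSOCIATIONS)
dgn_nonessential = disease_gene_network(ASSOCIATIONS, exclude=ESSENTIAL)

print(len(hdn), len(dgn_all), len(dgn_nonessential))
```

Disorders with no shared genes stay isolated in the first projection, which is how islands arise, and comparing the edge counts of the second projection before and after exclusion gives a crude measure of how much connectivity the excluded genes contributed.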
In another analysis of the properties of disease genes from a systems perspective, Zhong et al. [20] showed that genetic mutations that perturb networks at an edge (edgetic: mutated nearly full-length or point-mutated full-length gene products) rather than at a node (null: truncated gene products) are more heavily responsible for the phenotypic manifestation of diseases [20]. Edgetic perturbations are more commonly found in mutations that span multiple diseases; they are caused by in-frame mutations and result in the expression of a nearly full-length gene product that lacks an interaction with a neighboring node. In contrast, node removal, or a null mutation, involves severely truncated gene products or other mutations that destabilize protein structure and is equivalent to removing the protein with respect to its function and interactions. Edgetic perturbations have potential consequences at more than 1 node in the network. Furthermore, the authors cited examples in which different mutations of the same gene product affect different edges, with each edgetic mutation associated with a different disorder.
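The distinction between node removal and edgetic perturbation can be illustrated on a toy interaction graph. This is only a sketch under assumed inputs: the proteins `P`, `A`, `B`, and `C`, their interactions, and the two helper functions are hypothetical and are not drawn from the cited study.

```python
# Hypothetical undirected interaction network around a gene product "P".
NETWORK = {
    frozenset(("P", "A")),
    frozenset(("P", "B")),
    frozenset(("P", "C")),
    frozenset(("A", "B")),
}


def null_perturbation(edges, node):
    """Node removal: the product is lost, so every interaction it makes is lost."""
    return {edge for edge in edges if node not in edge}


def edgetic_perturbation(edges, node, partner):
    """Edge-specific loss: only the interaction between node and partner is lost."""
    return edges - {frozenset((node, partner))}


after_null = null_perturbation(NETWORK, "P")
after_edgetic_a = edgetic_perturbation(NETWORK, "P", "A")
after_edgetic_c = edgetic_perturbation(NETWORK, "P", "C")

print(len(after_null), len(after_edgetic_a), len(after_edgetic_c))
print(after_edgetic_a != after_edgetic_c)
```

Two edgetic mutations of the same product leave different residual networks, which is consistent with the idea that distinct mutations in one gene can be associated with distinct disorders, whereas the null case collapses all of them into the same outcome.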