|Home | About | Journals | Submit | Contact Us | Français|
Kidney diseases manifest in progressive loss of renal function, which ultimately leads to complete kidney failure. The mechanisms underlying the origins and progression of kidney diseases are not fully understood. Multiple factors involved in the pathogenesis of kidney diseases have made the traditional candidate gene approach of limited value toward full understanding of the molecular mechanisms of these diseases. A systems biology approach that integrates computational modeling with large-scale data gathering of the molecular changes could be useful in identifying the multiple interacting genes and their products that drive kidney diseases. Advances in biotechnology now make it possible to gather large data sets to characterize the role of the genome, epigenome, transcriptome, proteome, and metabolome in kidney diseases. When combined with computational analyses, these experimental approaches will provide a comprehensive understanding of the underlying biological processes. Multiscale analysis that connects the molecular interactions and cell biology of different kidney cells to renal physiology and pathology can be utilized to identify modules of biological and clinical importance that are perturbed in disease processes. This integration of experimental approaches and computational modeling is expected to generate new knowledge that can help to identify marker sets to guide the diagnosis, monitor disease progression, and identify new therapeutic targets.
Most kidney diseases have complex pathogenesis involving the interplay of genetic and environmental factors. Diagnostic and treatment decisions are typically based on kidney histology, a limited set of serologic markers, and clinical manifestations of the disease. Currently, we do not have sufficiently sensitive or specific tests to detect early disease, predict disease progression, or monitor treatment response. Previous studies have identified many candidate genes involved in the pathogenesis of kidney diseases. Although many of these genes have important roles in disease progression and pathophysiology, they do not independently serve as reliable indicators of the clinical disease state and progression rate. Many important questions regarding kidney diseases remain unanswered: What are the temporal and causal relationships between glomerular and tubular injuries? What are the interactions and temporal relationships among podocyte, mesangial, and endothelial cell injuries? Why proteinuria does not always correlate with the decline of kidney function? What are the drivers of kidney disease progression? To address these complex questions, it is critical to understand the regulatory networks that underlie the functions of and interactions between various types of cells within the kidney. Such an understanding is best obtained from studying the regulatory networks within and between cells. Systems biology allows us to analyze these regulatory networks in a temporal and spatial manner. Detailed characterization of kidney diseases at the systems level could yield the needed integrative knowledge to better understand the molecular relationships underlying the pathogenesis of different types of kidney diseases, to develop more reliable biomarker sets for diagnosis and treatment, and to personalize the care of individual patients.
With the advent of next-generation sequencing technologies and methods to identify molecular species and interactions at the genome-wide scale, it is theoretically feasible to generate a global picture of cellular functions in the kidney from a molecular perspective and to link phenotype to the molecular networks that govern pathophysiological changes. This approach presents an opportunity to identify the key pathways and genes operative in disease and health and to characterize mechanistic details that contribute to the phenotype of the disease. In addition, the ability to collect genomic data on individual patients in a more affordable and expeditious manner allows us to better elucidate the role of genomic variation in disease manifestation under varying environmental conditions.
The field of systems biology has grown in the past decade because of the development of new experimental and computational tools that enable us to connect gene–cell– organ regulatory mechanisms at multiple scales and integrate findings from molecular and cell biology with kidney structure and function. Here, we provide an overview of systems biology approaches applied to the study of kidney diseases. We focus on the rationale rather than the detailed methodology of these approaches and provide salient examples in which systems biology approaches have advanced our understanding of kidney diseases. Limitations of current applications of systems biology in kidney disease are also discussed. Through this review, we hope that both clinicians and researchers could better understand the rationale and appreciate the potential of systems biology approaches, and that researchers could gain additional knowledge on the advantages, limitations, and applications of systems biology approaches in the study of kidney diseases.
A unique identifiable characteristic of experimental approaches used in systems biology is the capacity to measure multiple entities simultaneously. Genomics, transcriptomics, proteomics, metabolomics, and other genome-wide-scale experimental methods referred to as ‘omics’ technologies are capable of measuring changes in a large number of components and interactions to provide an overview of different conditions: healthy versus diseased state, before versus after drug treatment, and one cell type versus another. Data from these experiments are analyzed to identify groups of components and patterns of interactions that change upon perturbation. Databases that integrate and organize data from such omics-type profiling experiments, focused on the kidney, are available. The different regulatory layers of data can be classified as genomics, epigenomics, DNA/transcription factor regulomics, transcriptomics, proteomics, kinomics, and metabolomics (summarized in Figure 1). Specific examples of omic-level analysis related to the kidney are presented in Table 1. Major advantages and limitations of omic approaches are summarized in Table 2. Online repositories and catalogs of omic-level data sets and studies are listed in Table 3.
Great progress in genome-wide mapping of complex traits in humans was made by genome-wide association (GWA) studies and the admixture linkage analyses. A GWA study correlates allele frequencies in an unbiased manner at each of several hundred thousand markers spaced throughout the genome with trait variation in a population-based sample.1 In contrast, admixture mapping relies on the presence of extended haplotypes from parental ancestry to map a trait or disease that is more prevalent in one ethnic population than another. These approaches have been used to characterize chronic kidney disease,2 diabetic nephropathy,3,4 immunoglobulin A nephropathy,5 and focal segmental glomerulosclerosis (FSGS).6–8 For a detailed discussion about study design and execution of GWA analysis, readers are referred to Sale et al.9 and Iles.10
Most GWA studies have shown that disease traits are associated with small to moderate effects at several loci, and only a few loci with large effects have been identified. Similarly, when investigators from the Family Investigation of Nephropathy and Diabetes study used glomerular filtration rate as a quantitative trait to detect susceptibility loci for diabetic nephropathy in patients from different ethnic populations by admixture linkage analysis, they found that multiple loci of small effects were linked to glomerular filtration rate but none had large effects.4 Other studies that use admixture scan in end-stage renal disease patients of African ancestry have found that the MYH9 gene is highly associated with FSGS and human immunodeficiency virus-associated nephropathy.6,7 More recently, Genovese et al.11 examined the risk alleles linked to FSGS in a large cohort of African Americans, using the GWA approach, and identified strong association of FSGS with two alleles of the ApoL1 gene, which is located on the same region of chromosome 22 as MYH9. In fact, the statistical significance of the two ApoL1 variants was 35 orders of magnitude stronger than that for the MYH9 single-nucleotide polymorphisms. When the effects of ApoL1 variants were excluded, no residual association remained after correction for multiple single-nucleotide polymorphism testing.11
For most complex traits identified in humans, the sum of genetic effects comprises less than half of the total trait heritability. The remaining effects are thought to be due to untested rare variants, gene–gene, and gene–environment interactions. To improve the quality of GWA studies, the candidate locus could be re-sequenced in large cohorts, and eventually whole-genome sequencing of large numbers of subjects would be needed. An alternative approach is to integrate the functional biological knowledge with gene association analyses at a systems level. Systems genetics can be used to integrate large sets of genetic variants and/or genes with other functional data sets, such as protein–protein interactions and cell-signaling pathways, to further characterize the biology of the complex traits.12 The approach of identifying genomic loci, where genetic variants significantly affect gene expression patterns, is called expression quantitative trait locus (eQTL) mapping. Papeta et al.13 have used the eQTL approach to understand the mechanisms of kidney disease. They performed eQTL analysis of podocytes in HIV-1 transgenic mice to gain insight into the genetic susceptibility of human immunodeficiency virus-associated nephropathy.13 They found that transcript levels of the podocyte gene nephrosis 2 homolog (Nphs2) were heritable and controlled by an ancestral eQTL. In addition, Nphs2 expression was controlled by two eQTLs that localized to the nephropathy susceptibility intervals in the genome loci of their origin. These data demonstrate that transcript levels of Nphs2 and related genes expressed in the podocytes are networked, and suggest that the genetic lesions introduced by human immunodeficiency virus-associated nephropathy susceptibility alleles perturb this regulatory network and transcriptional responses to HIV-1
With the development of second-generation sequencing technology that enables massively parallel sequencing, it is now possible for individual laboratories to sequence the whole human genome. Although technically feasible, the cost and capacity to sequence the whole human genome of many individuals is still prohibitive for most laboratories. Although complete sequencing of the genome has the ability to discover the complete spectrum of genomic changes including base substitutions, rearrangements, and copy number, less comprehensive approaches with incomplete coverage—such as whole or partial exome and transcriptome sequencing—can provide the same depth of coverage in less time and reduced cost.14 Targeted sequencing approaches, such as whole or partial exome and transcriptome sequencing, can be applied to diseases in which the causative genes have been identified, or if the mutation is known to occur in the exome. Investigators can apply specific capture methods to select for the desired genomic region before massively parallel sequencing. As the human exome comprises approximately 1% of the genome, significant greater sequence coverage can be achieved with a lower cost than whole-genome sequencing. Whole-exome sequencing has not been applied to the study of kidney disease, but has been used successfully in identifying a novel coding variant as a result of a missense substitution in a gene, SLC26A3, known to cause congenital chloride-losing diarrhea15 and variants in the DHODH gene that causes Miller syndrome.16
Transcriptomics is defined as the analysis of mRNA expression patterns at the genome-wide scale. Such experiments have been the most widely applied systems-level approach so far. Changes in the mRNA levels are measured by microarray technology, which is being replaced by high-throughput sequencing of mRNAs, also known as RNA-seq. Efforts to combine transcriptomic data from biological samples annotated with clinical parameters have been made to generate a broad and unbiased view of disease processes and to identify transcriptomic fingerprints associated with defined clinical outcomes. Transcriptomic studies have been conducted to profile expression changes in diabetic nephropathy (DN),17,18 FSGS,19 lupus nephritis,20 chronic kidney disease,21 and glomerular diseases,22 as well as in animal models of DN,23 chronic kidney disease,21 lupus nephritis,24 and aging.25 Transcriptomic data sets have provided insights into the mechanisms of disease processes and identified novel genes responsible for the progression of kidney diseases. Transcriptomic analysis has the potential to define the molecular patterns of disease, allowing early diagnosis, prediction of prognosis, and response to therapy. The current use of transcriptomics in kidney disease research has several limitations. The heterogeneity of kidney tissue (different cell types), as well as the genomic variability of human samples, makes the interpretation of transcriptomic data challenging. Although attempts have been made to differentiate the compartmental gene expression in the glomerulus and tubulointerstitium by microdissection or laser capture, each of these compartments comprised several cell types— endothelial cells, podocytes, and mesangial cells in the glomerulus and tubular epithelial cells, endothelial cells, and infiltrating cells from the circulation in the tubulointer-stitium—and the gene expression profile of the different cell types in the compartments could be variable and not able to be distinguished from each other even with microdissection. In addition, changes in mRNA levels do not always reflect changes in the protein level or protein function due to multiple layers of regulation occurring from transcription to protein synthesis and post-translational modifications needed for understanding regulatory mechanisms.
Quantitative changes in mRNA levels do not directly explain how regulatory mechanisms are altered during disease to induce changes in gene expression, and in turn lead to changes in cellular- and tissue-level phenotypes. Genome-wide measurements of protein–DNA interactions by chromatin immunoprecipitation (ChIP) combined with quantitative measurements of transcriptomes are increasingly used to connect regulatory inputs with transcriptional outputs. Although genome-wide screening of binding sites by DNA binding proteins with ChIP had mostly relied on DNA microarray technology in the past, ChIP followed by high-throughput sequencing (ChIP-seq), which offers distinct advantages in increased specificity, sensitivity, and genome-wide comprehensiveness, is now being used more widely.26 The key challenge for ChIP-seq algorithms is to identify reproducibly true binding locations while including as few false positives as possible.27 The main advantage of ChIP-seq experiments is that identified binding sites for transcription factors can be directly correlated with transcriptional changes, connecting gene expression to the first layer of regulation. Transcriptional regulatory networks upstream of gene expression can be deduced by combining in silico promoter motif-based analyses combined with enrichment analysis of integrated results from ChIP-chip or ChIP experiments.28 This approach has been recently described by Ma'ayan and He.29
It is important to understand how the transcriptional regulatory networks are altered in kidney diseases. The transcription regulatory networks involved in the pathogenesis of DN were studied in a cohort of patients with progressive DN.30 By comparing the gene expression profiles in the tubulointerstitial compartment of patients with DN with those with mild disease, it was revealed that inflammation-stress response genes such as nuclear factor kappa B (NF-κB) target genes are differentially expressed. Systematic promoter module analysis of the upregulated NF-κB-dependent genes identified an enrichment of a specific NF-κB-binding module in the promoter regions of known NF-κB targets that were upregulated in the patients with progressive DN, but not in patients with mild disease or the controls.30 Using this specific NF-κB module, they were able to correctly predict other genes that are upregulated in the patients with progressive DN.
The term epigenetics broadly refers to regulatory mechanisms that arise due to differences in the state of chromatin and the proteins associated with it. Such differences arise because of covalent modifications such as histone acetylation or DNA methylation. These modifications are thought to alter chromatin density and accessibility of the transcriptional machinery to the DNA, thereby modulating the expression of genes. Epigenetic regulation has a crucial role in normal physiological development, as well as in pathological conditions. Epigenetics of kidney development has been reviewed previously.31,32 DNA methylation on epigenetic signatures of kidney-specific transporters was examined by Yagi et al.33 They demonstrated a central role for DNA methylation in kidney-specific expression of amino acid transporters.33 Bechtel et al.34 compared primary human fibroblasts from fibrotic kidneys with fibroblasts from non-fibrotic kidneys using a genome-wide methylation screen. This screen revealed 12 genes that were methylated in all seven tested fibrotic fibroblast samples and non-methylated in all tested non-fibrotic fibroblast samples. They identified RASAL1 as a candidate that may facilitate fibroblast activation and progression of renal fibrosis. These two studies demonstrate that environmental conditions influence the epigenetic state of a cell. Epigenetics provide an added layer of regulation that could mediate the relationship between genotype and environmental factors. Therefore, characterization of the epigenomic profile of kidney disease could help us understand the disease pathogenesis, as well as identify potential new biomarkers for disease diagnosis. Integration of DNA methylation profile with gene expression profile could give us a better understanding of gene regulation in kidney disease.
MicroRNAs (miRNAs) are a class of endogenous, small, non-coding 21- or 22-nt-long RNAs that have been implicated in the regulation of multiple biological processes. MiRNAs are critical in the maintenance of glomerular homeostasis and in the progression of renal disease.35 When Dicer, an enzyme that converts pre-miRNAs into functional small-interfering RNAs, was inactivated in mouse podocytes, the mice developed proteinuria and died subsequently from renal failure.36 Similarly, interruption of miRNA biogenesis in mouse podocytes resulted in proteinuria, podocyte dediffer-entiation, and crescent formation, leading to end-stage kidney diseases.35 Global miRNA expression profiling on samples prepared from the kidneys of C57BL/6 mice that underwent unilateral warm ischemia revealed nine miRNAs that are differentially expressed following ischemia–reperfusion injury when compared with sham controls.37 One of these nine miRNAs is miR-21, which may have a role in protecting renal tubular epithelial cells from death. Recent studies suggest that miR-192 may also regulate transforming growth factor-β/Smad3 signaling in the development of renal fibrosis.38–40
The expression profile of miRNA offers some important potential advantages over standard mRNA or other protein-based profiles. MiRNAs appear to be very stable in tissues and biological fluids, including urine.41 They are protected from endogenous RNase by virtue of their small size and perhaps by packaging within exosomes, which protect miRNAs from RNase degradation.41 In addition, the tissue-specific nature of miRNA expression makes them potentially ideal candidates for serving as biomarkers.42 The total number of human miRNAs, estimated to be between 700 and 1000, is considerably smaller than the number of protein-coding mRNAs (>22,000).43 MiRNAs have been detected in urine and could be used as biomarkers for kidney disease.44 These urinary miRNAs may be filtered directly from the kidney or the urinary tract; alternatively, miRNAs from detached cells or packaged in exosome may also contribute to the miRNAs detected in the urine. However, there are still major challenges to overcome before expression profiling of miRNAs can be used widely. For instance, miRNA biology and regulation are still unclear and bioinformatics prediction of miRNA-target pairs is still unreliable.
The use of proteomics in kidney disease research has grown substantially over the past decade. The proteomics approach has been applied to study renal physiology and this has been reviewed by Hoorn et al.45 Proteomics technologies have been used to identify promising new biomarkers for acute kidney injury and chronic kidney disease. Proteomics may also help to identify new uremic toxins in renal failure patients. Proteomic analysis has been performed to profile normal human and rat renal cortices,46 human renal glomerulus,47,48 urine,49,50 and urinary exosomes.51 Many of these studies have tried to identify a single biomarker for kidney disease. This has not been successful because of the biological variability and complex pathophysiology of kidney diseases. Development of classifiers based on sets of protein biomarkers may be more predictive for kidney disease progression.
Urinary proteins can originate from the filtering of plasma, vascular damage and/or leakage, alterations in tubular metabolism, and the reabsorption or direct shedding and/or secretion of proteins by cells throughout the urological tract. Urine represents an attractive source of potential biomarkers in renal diseases because of the non-invasive nature, easy collection, and low amount of proteins and peptides that are required for analysis. Numerous protein biomarkers have been identified in both health and disease. Urinary proteomics has been used to identify the early biomarkers of acute kidney injury.52 Specific protein panels from urine could help in early diagnosis of acute kidney injury and assess the severity and duration of the disease. However, the sensitivity and specificity of these markers needs to be validated in large clinical studies.53 Proteomic studies have also been conducted to identify biomarkers to predict the development and progression of diabetic kidney disease (reviewed in Merchant and Klein54 and Rossing et al.55) and immunoglobulin A nephropathy.56 Proteomics has also been used to study patients with other glomerular diseases, patients on dialysis, or recipients of kidney transplant.57,58 The proteomic resources of human kidney tissue and urine have been reviewed recently by Yamamoto.59
Proteomics-based approaches have generated large amount of data in clinical medicine. These data could be used to discover novel biomarker sets and gain mechanistic insights of disease origins. However, there are substantial experimental and computational challenges. Methods used to concentrate the proteins and remove salts often affect the yield. Numerous methodological and biological variables can influence peptide profiles.60 One major issue is the standardization of procedures for clinical applications. Given the large numbers and varying abundances of different proteins in biological samples, no single experimental approach appears adequate for the reproducible characterization of the whole proteome state. The nature, number, and quality of clinical samples, temperature and length of storage, and nature of sample receptacle are all critical. The Human Kidney and Urine Proteome Project have been developing a standardized protocol for urine collection and storage (http://www.hkupp.org). Standardization of techniques is required to obtain more reliable proteomics data to study kidney disease.
Whereas proteomics profiling tracks the global expression of proteins, metabolomics profiling characterizes and quantifies metabolites. Only a subset of proteins is responsible for processing metabolites, and thus the information from proteomics and metabolomics are complimentary, whereas the combination of these approaches often provides a better appreciation of cells', organs', or organism's molecular state. The advantage of metabolomics is that it could reveal changes in both phenotype and genotype.61 Computational approaches that would integrate metabolomics and proteomics data sets may provide more reliable biomarker sets.
Metabolomics has been used to study kidney diseases for the last several years, although the data are still quite limited. Studies of metabolite profile in blood, kidney tissue, and urine have been used to identify early biomarkers for kidney disease.62,63 It was reported that changes in urine metabolite profile such as glucose, amino acids, and fatty acids could be used as early biomarkers for cisplatin-induced acute kidney injury.62 Metabolomics is also used to identify biomarkers to distinguish acute rejection from acute kidney injury in kidney transplant patients.63 However, the application of metabolomics in kidney disease is still at an early stage. Validation of these metabolic biomarkers with respect to reproducibility, sensitivity, and specificity will need to be performed. Clinical applications are currently limited because of technical challenges.
Computational systems biology is an evolving field that encompasses a range of approaches from statistical correlation-based methods to network topology analyses, to dynamical models of biochemical and physiological functions, to detailed stochastic simulations of few components, to multiscale dynamical models that describe molecular networks in space and time and physiological functions of organs and organisms. The large data sets collected at the different layers of regulatory interactions require the development of databases and modeling algorithms capable of integrating such data to convert information into knowledge.
Genomic studies, transcriptional profiling, genome-wide ChIP analyses, miRNAs profiling, proteomics, phosphoproteomics, and metabolomics produce high-dimensional data sets. Commonly, these data sets are analyzed by statistical methods to cluster genes and proteins based on correlated expression, and then the clusters are linked to functional categories.20 Clusters identified based on expression similarity64 are commonly linked to biological themes where sets of co-regulated or co-expressed genes are linked to prior biological knowledge to assess enrichment for pathways or other properties.65,66 Correlated changes in the expression or activity of molecular components are also linked with the physiological or pathophysiological effects at the organ and organism levels. Such analyses serve as the starting point for understanding the molecular consequences of the different experimental conditions or disease states. This type of analyses produce ranked lists of components that vary with the underlying pathophysiology. Often it is discovered that such lists of components share biological themes that can explain observed outcomes. For example, Gene Ontology enrichment analysis67 is used to score how components from observed changes in expression belong to the same biochemical function or are present in a specific cellular or extracellular location. However, such analyses do not provide a full understanding of the mechanisms underlying disease origins or progression. Lacking in this type of analyses is the regulatory relationships between components in ranked lists of differentially expressed genes. However, such relationships can be deduced statistically using Bayesian approaches68,69 or by linking list members to networks developed from prior biological knowledge65,66,70,71 (Table 3 contains a list of tools and databases for functional annotation and pathway and interaction network analysis). The next step is to integrate such networks toward mechanistic understanding. To obtain such mechanistic understanding, there are two broad challenges that are currently actively being pursued: how do we integrate data from different type of profiling experimental approaches at different regulatory levels of cellular regulation? And how do we combine different classes of genome-wide experimental data with clinical measurements?
Lists from omic studies can be used to reconstruct models of cellular networks. This can be done using statistical methods. Methods used to develop networks directly from expression data include methods such as Bayesian networks68,69 or information theory-based approaches.72 Networks can also be created using prior knowledge such as protein–protein interactions or cell signaling networks developed from biochemistry and cell biology literature.65,66,70,71 In addition, integrating data from different types of profiling experimental approaches, such as those described above, can be achieved by anchoring genes within prior knowledge networks.73 A similar approach allows the construction of molecular networks that connect diseases to genes based on GWA studies, miRNAs to the mRNAs they target,74 protein/DNA interactions based on ChIP analyses,28 protein–protein interactions,65,66,70,71 metabolic pathway interactions, virus/host protein interactions,75 and drug–target interactions76 to each other to form heterogeneous networks made of different types of nodes and different types of links (Figure 2). Such networks can be analyzed using clustering algorithms to identify modules altered in various kidney diseases. Such modules can serve as sets of disease biomarkers, and components of the modules can become potential drug targets. Network analysis algorithms can be used to identify such modules robustly (see review by Ma'ayan77). Once such modules are identified, quantitative dynamical modeling approaches can be applied to further understand the dynamical behavior of such modules, and how the dynamics are altered during pathophysiology.
The major functions of the kidney include clearance of toxins and maintenance of fluids, electrolytes, and acid–base balance. Normal kidney function is maintained by coordinated regulation at different levels of organization (Figure 3). At the organ level, the heart perfuses the kidney, which is regulated by the neuro–hormonal system. At the nephron (a single functional unit of the kidney) level, autoregulation of renal blood flow, tubuloglomerular feedback loop, and nephro–vascular exchanges in the medulla are all required to maintain normal nephron function. At the cellular level, normal structure and function of endothelium/glomerular basement membrane/podocyte are essential for the maintenance of the glomerular filtration barrier, and polarized tubular epithelial cells are required for reabsorption and secretion of fluid and electrolytes along the nephron.
There are several types of computational models of kidney physiology with varying level of organization: multiorgan models, models of kidney function at the organ level, models of the tissue compartments within the kidney, and models of kidney cell types. The SAPHIR (a Systems Approach for Physiological Integration of Renal, Cardiac, and Respiratory function) is the best example of a multiorgan model.78 This is a model of body fluid homeostasis and blood pressure regulation and has been reviewed by Thomas.79 Karaaslan et al.80 developed an integrated cardiovascular/renal model with emphasis on the role of renal sympathetic nerve activity. In the 1960s, Guyton et al.81 were the first to develop a global model of kidney function and the overall regulation of blood pressure. Ikeda et al.82 extended the Guyton model to the regulation of acid–base balance. The classical models from Guyton and Ikeda constitute the basis for current development of a core model for fluid and solute regulation. Several whole kidney models have been also developed to study the genesis of the medullary osmotic gradient.83,84
Many computational models have been developed to study different aspects of the local renal physiology, including membrane and epithelial transport of fluid and electrolytes for different segments of the nephron,85 autoregulation of renal blood flow,86,87 and tubuloglomerular feedback.88 At the glomerular level, many modeling studies analyze factors involved in single-nephron glomerular filtration rate89 and structural defect for reduced glomerular filtration capacity.90 Glomerular transcriptional profiles and protein–protein interaction network in the glomerulus have been used to identify molecular signatures responsible for these physiological changes.91 Models of different tubular segments have been developed to address absorption and secretion (reviewed by Weinstein92).
Structural models of various aspects of the renal anatomy have been described. Because of advances in imaging technologies, detailed three-dimensional reconstructions of the nephron and surrounding blood vessels can be obtained from serial sections.93,94 Nordsletten et al.95 reconstructed rat renal vasculature from micro-computed tomography scans. Other detailed three-dimensional glomerular reconstructions have also been published.96,97 Recent studies have used scanning electron microscopy with a high-sensitivity detector to image the deepest regions of the filtration slits and visualize the porous structure of the slit diaphragm.98 In the future, it will be important to connect these morphological characteristics to physiological functions and in turn to cellular regulatory networks.
The application of systems biology approaches to understand renal physiology is now termed as renal physiome. Technological advances allow us to obtain quantitative measurements of renal physiologic parameters and detailed morphology. Development of detailed computational models would help to integrate quantitative information of renal physiology at different scales from renal cells to the whole kidney. Independently, high-throughput genomics and proteomics data sets, as described above, are being collected at the tissue and cellular levels. Integrated models that merge genomic and proteomic data with physiological data at the cellular level are needed. Such models will have to successfully merge the network analyses used to study the omics data sets with differential equation-based dynamical analyses used for physiological models. The development of simulation environments, such as the Virtual Cell, makes this approach feasible.99 Such models will serve as the base for building the kidney physiome. The multiple levels of models that need to be considered during the construction of an overall multi-scale computational model of the functional kidney (i.e., the kidney physiome) are summarized in Figure 3. A major challenge in this field is how to integrate the multiscale data. There is still a large gap between the functionality and the true ability to use ontological data at the organ level to inform molecular pathways. Furthermore, there is a lack of strategies for integration of quantitative biological sources stored in different standards.
Computational modeling of the multiple scales of disease processes is at the core of systems biomedicine (Figure 4). To capture the quantitative properties of diseases (i.e., severity, manifestation of various signs and symptoms of the disease, and rate of progression), investigators in the field of oncology have linked clinical data from patients to molecular signatures identified from malignant cells or tissues.100 High-quality data on clinical and pathological features and specific molecular makers of diseases, which permitted the linking of phenotype-to-genotypic variation in the study of cancer biology, is still largely lacking in the field of nephrology. One notable exception is the European Renal cDNA Bank-Kröner–Fresenius Biopsy Bank, which has collected renal biopsy samples, as well as clinical and histology reports. Data compiled by the European Renal cDNA Bank-Kröner–Fresenius Biopsy Bank have been used in the study of several kidney diseases (reviewed in Neusser et al.101). However, none of the published studies have fully linked clinical data to molecular makers in a systems-wide manner. Some studies have used one type of clinical data (for example, histological diagnosis, severity of renal impairment, rate of disease progression, extent of proteinuria, or specific lab values) and correlated with data on gene expression.102–104 Although this is a necessary starting point, the next step is to use a systems approach to profile the relationship of multiple clinical data points with omic data sets (for example, genome, transcriptome, or proteome) and construct a genotype–phenotype interaction map as it had been described for oncologic diseases.105,106 This relationship between clinical characteristics and molecular features can serve as bookend for the development of the kidney physiome and increase our understanding of the regulatory networks underlying the disease processes.
The major challenge in the field of systems medicine is the imbalance between the large number of omic data sets and the small sample size of patients with a well-defined disease category. This causes significant issues for data analysis and interpretation, leading to a so-called ‘multiple testing’ problem. To overcome this issue, it is critical to develop national and international collaborative studies for collecting large number of samples with strict disease classification criteria. An example of this approach is the Nephrotic Syndrome Study Network, which is a major multicenter initiative funded by the National Institutes of Health to enroll patients with nephrotic syndrome (http://clinicaltrials.gov NCT01209000). The aims of the project are to investigate the underlying causes of the FSGS, minimal change disease, and membranous nephropathy, identify effective treatment for each disease, and develop meaningful approach to classify these disorders in order to inform the selection of the most effective treatments. Clinical data and biological samples (kidney biopsy, urine, and blood) will be collected prospectively. Transcriptomic profiles generated from the biological samples will be used to identify transcriptional networks and classify participants into distinct molecular subgroups. Such an approach to prospectively enroll patients, gather biological samples, and collect clinical data will be an invaluable resource for future investigations as newer technologies become available. With this we can expect to fill in the voids in our molecular understandings of a group of clinical disease categories that do not necessarily share the same molecular underpinning.
In the current practice of renal medicine, nephrologists formulate a clinical diagnosis, treatment plan, or prediction on disease progression using a rather restricted set of clinical parameters. These clinical findings are direct and indirect reflections of the underlying molecular and cellular processes. Different diseases often share similar clinical manifestation and cannot be distinguished from each other using the currently available knowledge and tools. To better characterize disease processes, software systems that are capable of handling the analysis of multivariant data and predict disease classification by machine-learning approaches have been developed. The goal is to achieve further characterization and classification of disease processes using currently available clinical and molecular data sets as biological classifiers. Instead of sorting patients into broad clinical categories, such as responder versus non-responder to a particular treatment, the systems biomedicine approach is to refine the categorization of diseases and be able to distinguish the genetic and phenotypic individuality of any given patient and tailor a specific therapeutic plan. Systems level analysis may identify a particular set of single-nucleotide polymorphism variants, transcriptomic profile, and urine proteomic and metabolomic characteristics that could be translated into clinical tests to help identify patients who are more or less likely to respond to a particular treatment or have a faster or slower progression of renal function loss leading to more personalized medicine.
Another major challenge in the field of chronic kidney disease is the lack of the early hard outcomes for clinical studies because disease progression usually takes decades. Biomarkers derived from omics studies could help predict the response of drugs at an early stage without waiting for the development of ‘hard outcomes’ such as doubling of creatinine or progression to end-stage renal disease, which might take years. Validation of omics biomarkers as early predictors of disease in prospective studies is required but hard to achieve, given the relatively low incidence of glomerular diseases, with the exception of DN. Therefore, the national and international collaborative studies are needed to study these diseases with low incidence.
Treatment options for kidney disease are currently quite limited. Most primary glomerular diseases are still treated with steroids and immunosuppressive medications. Long-term use of these medications can cause significant side effects, including severe infections and malignancies. Attempts have been made to develop more specific drugs to treat diabetic kidney disease. Although multiple phase II clinical trials had shown promising results, the only successful examples are studies using angiotensin-converting enzyme inhibitors or angiotensin receptor blockers,107–111 whereas many trials including aminoguanidine112,113 and sulodexide (http://clinicaltrials.gov identifier NCT00130208) failed in phase III clinical trials. A large body of evidence suggests that transforming growth factor-β has a key role in kidney fibrosis. Many drug companies have tried to develop anti-transforming growth factor-β agents in the last decade. However, they are limited to phase II studies. It is generally believed that podocyte injury is an early event leading to glomerular disease. Therefore, several groups have tried to develop specific drugs to prevent podocyte injury. We have recently found that retinoic acid can protect podocytes from injury in an animal model of human immunodeficiency virus-associated nephropathy.114 However, these approaches are still limited to experimental models of disease. Overall, traditional approaches to discover drugs to treat kidney disease have not been largely successful. This lack of success indicates that a different approach is needed.
Systems pharmacology is an alternative approach to discover drugs to treat complex diseases such as kidney diseases. Systems pharmacology views therapeutic targets of drugs as parts of cellular networks that control physiological responses.115 This approach is useful in understanding drug action both therapeutic and adverse. We have used this approach to study cardiac arrhythmia.115 Network analyses of arrhythmias suggest that the genes involved in arrhythmias form an identifiable region (subnetwork) within the human interactome. Identifying such disease-specific subnetworks within the human interactome and determining whether key components within these subnetworks are expressed in the tissue of interest are potential approaches to identify new drug targets and disease-specific genes. Omics studies that are used to develop biomarker sets for disease progression can also be used to identify disease-specific subnetworks within the human interactome, which can serve as the basis for identifying new drug targets. In addition, such subnetworks can also be used to repurpose current drugs for the treatment of kidney diseases if their therapeutic targets fall within the disease-specific subnetworks. Such potential drug targets could then be computationally validated using dynamical models to determine whether drug-dependent change in activity of the target will affect overall physiological function. This type of computational prescreening could help reduce the cost of developing new drugs for kidney diseases and predict adverse events in susceptible populations.
It is increasingly clear that diseases, such as most kidney diseases, with complex pathophysiologies will require that we understand the physiology at the level of molecular and cellular networks at the genome-wide scale, both to predict disease progression and to develop drugs for treatment. Systems biology approaches can be used to develop an understanding of how alterations in cellular components and interactions lead to pathophysiology. Integrated analyses of omics profiles derived from the genome, transcriptome, proteome, and metabolome levels is the first step in realizing a systems view of cellular processes. However, merging these different experimental data sets, for example, mRNA concentration and protein abundance profiles, is not straightforward, as a direct overlap of differentially regulated/abundant features, resulting from transcriptomics and proteomics, is often limited. Furthermore, it is important to consider the regulatory mechanisms of transcription by post-translational modifications and protein degradation, and epigenetic regulation. Some studies suggest that the analysis of transcriptomic and proteomic data on the level of protein interaction networks may be a way of identifying the link between the two data sets.116 Changes in metabolite levels do not necessarily parallel changes in the transcriptome or the proteome because changes in the level of a particular transcript do not necessarily result in an instantaneous change in enzymatic activity. Therefore, it is essential to investigate the changes in gene transcripts, proteins, and metabolites of the same individual over multiple time points in order to understand their regulatory relationship at a systems level. Although this is a challenge that needs to be addressed within the context of basic systems biology, such integration is critical for the development of systems biomedicine as a clinically useful discipline.
A major challenge in unraveling the complexity of disease mechanisms is to integrate GWA studies with gene expression patterns in order to establish the mechanistic connections between the measured genotype and observed phenotype. For renal disease, the availability of both tissue- and disease-specific expression data sets makes such a strategy feasible. Computational approaches in systems biology such as network analysis will have a major role in developing this genotype-to-phenotype connection. Connecting experimental data at different scales from genes to cellular networks to organ function will help to identify molecular signatures for renal physiology and physiopathology. Integration of omic data sets obtained over extended time periods during the progression of kidney diseases could provide the mechanistic understanding of how multiple factors coordinately regulate the induction and progression of kidney diseases. Integration of these dynamic changes at molecular levels with clinical parameters will be the key step in the process of identifying the biological basis and developing biomarkers and therapeutic agents of kidney diseases.
When comparing large data sets with millions of data points, the likelihood of detecting a false positive is high. One of the major challenges is to identity and sort out changes on the molecular and physiological levels that actually have an impact on the clinical outcome of the individual. Qualitative and strict statistical interpretation of findings needs to be applied first in order to avoid spurious findings and interpretations.
There is no doubt that there is considerable work that needs to be done at both the experimental and computational levels to successfully apply systems biology to treat kidney diseases. Challenges range all the way from developing appropriate standard techniques to obtain patient samples to the development of complex multiscale models that explicitly integrate genomic information with clinical data. In spite of these challenges, the systems biology approaches described here provide a clearly identifiable path by which our knowledge of biological mechanisms can be used to better diagnose and treat kidney diseases (Figure 4).
The focus of future systems biology studies in kidney disease is to identify reliable early biomarkers for molecular diagnosis and prediction of disease progression. The discovery of the reliable biomarkers requires massive efforts from nephrologists to collect a large number of samples from patients with well-defined disease category. To achieve this, we need to develop international collaboration among the physicians, epidemiologists, molecular biologists, statisticians, and systems biologists with computer science and mathematical backgrounds. The discovery of new biomarkers will improve the design of clinical trials and identify patients at an earlier stage of the disease. Integrating the large omic data sets with drug and disease databases could guide the prediction of drug efficacy and side effects. These systems approaches could eventually help us reduce the chance of fails when we move clinical trials of new drugs from phase II to phase III. Advancements in this field will eventually lead to personalized medicine to treat patients with different subtypes of kidney diseases at different stages of progression.
JCH is supported by NIH 1R01DK078897 and VA Merit Award; JCH and AM are supported by NIH 1R01DK088541 and 1RC4DK090860; AM is supported by NIH RC2LM010994-01. RI and AM are supported by the Systems Biology Center grant 5P50GM071558; RI and JCH are supported by NIH 5R01DK087650. PYC is supported by NIH 5K08DK082760.
Disclosure: All the authors declared no competing interests.