To evaluate the potential for computational systems biology analysis of host:pathogen interactions (the interactome) to be used as a tool for enhanced rational design of vaccines, each host/pathogen interaction condition was modeled and scored 219 known metabolic and signaling pathways and 1620 biological processes (gene groups associated with Gene Ontology (GO) terms) at four time points. DBGGA was employed to identify the perturbations between pathogen conditions for pathways, GO categories, and genes. The DBGGA method generates Bayesian log likelihood scores that are normalized and transformed to a z-score equivalent (the Bayesian z-score) so that all pathways and GO categories across all host/pathogen conditions can be equivalently compared and assessed for significance.
DBGGA Gene Ontology Analysis
A DBGGA analysis was conducted for gene GO categories. For each pathogen condition, 1620 biological process GO categories were scored. Each condition produced its own unique set of highly scored GO functions, but for comparison purposes, we chose a small subset of highly perturbed categories to illustrate the different temporal responses as shown in . illustrates the comparative analysis of gene scores for just the phagocytosis GO term category. The gene ontology group scores show a very diverse pattern over time. As can be seen, the induction or suppression of groups of genes allows us to identify specific biological process groups that define the pathogenicity biosignatures of each pathogen. Individual gene patterns within the groups can be further compared as show in . For example the gene encoding ADORA1 (adenosine A1 receptor) is up modulated in BMEL for at all four time points, but is not significantly expressed in MAP or STM. Comparative pathogenicity can provide important insights into the mechanistic differences and guiding the research biologists to identify unique and common mechanisms that may be new targets for immunotherapeutic drugs or indicators of immunogenicity of novel vaccine candidates. Such comparative modeling could also be utilized to compare the effectiveness of vaccine candidates across multiple species.
DBGGA Pathway Analysis
Of the 219 signaling/metabolic pathways scored, we focused on a subset of immune response related pathways as listed in . This figure shows a heat map comparison of pathway Bayesian z-scores between pathogen conditions over time post infection. There were considerable differences between the host response profiles. MAP had strong early (30 minute) induction of the majority of its pathways and appeared to reverse to a more suppressive state by 240 minutes. STM's early response indicated mild perturbations at 30 minutes that increased over time until several pathways were strongly induced by 240 minutes. BMEL was more strongly suppressive for the majority of pathways over time. At early times (30, 60 minutes) there were a few commonly induced pathways: Antigen Processing and Presentation, B Cell Receptor Signaling, Fc epsilon RI Signaling, Hedgehog Signaling, and Natural Killer Cell Mediated Cytotoxicity. In contrast, only ECM-receptor Interaction, Apoptotic Signaling and Apoptotic DNA Fragmentation had similar suppressions at 30 and 60 minutes. Interestingly, there was no single pathway at later times (120, 240 minutes) with similar perturbed states, implying that the host defenses have divergent biosignatures against the various virulent mechanisms presented by the pathogens.
Significant divergent responses between conditions were observed for MAPK Signaling and Regulation of Actin Cytoskeleton, for example. For MAP, the MAPK pathway reversed from induced to suppressed, while STM increased induction and BMEL maintained a suppressed state. The MAPK Signaling Pathway has been implicated in bacterial pathogenesis for a number of pathogens such as Salmonella enterica
serovar Typhimurium (Hobbie, Chen et al. 1997
spp. (Ruckdeschel, Harb et al. 1998
), Listeria monocytogenes
(Tang, Sutherland et al. 1998
), and Mycobacterium
spp. (Schorey and Cooper 2003
). For the MAP condition, the Regulation of Actin Cytoskeleton pathway was induced within the first 30 minutes and became suppressed after 240 minutes, while for STM, this pathway became more strongly induced over the course of 240 minutes. The BMEL condition had a biphasic response from suppressed to induced and back to being suppressed over the 240 minute time course. The manipulation of the Actin Cytoskeleton pathway by STM to invade host cells has been well established (Guiney and Lesnick 2005
), but not as well documented for MAP or BMEL.
The MAPK pathway was selected as a potential candidate gene for more detailed discussion with regard to gene perturbations, mechanistic interpretations, and gene knockout simulation. is a heat map of significantly perturbed genes for the MAPK pathway by pathogen condition. In this figure, the genes are sorted in order of highest up modulation to lowest down modulation, and for a gene to be included in this figure, a Bayesian z-score>|2.24| at any one time point was required. The Bayesian z-score > |2.24| reflects 99% confidence in the data. It is easy to observe that the perturbed genes and their expression patterns are quite different between conditions. Surprisingly, of the 171 measured genes in this pathway, only two genes in were found to be commonly perturbed across all three pathogen conditions: 1) IL1A, which encodes interleukin 1 protein involved in various immune responses, inflammatory processes, and hematopoiesis; and 2) RASGRP1, which encodes a protein characterized by the presence of a Ras superfamily guanine nucleotide exchange factor (GEF) domain that activates the Erk/MAP kinase cascade and regulates T-cell and B-cell development, homeostasis and differentiation. The perturbation of IL1A and RASGRP1 is consistent with genes involved in immune response, but the expression patterns for these two genes vary significantly between pathogens.
Simply comparing and contrasting the expression patterns of perturbed genes was inadequate for deciphering the MAPK pathway response dynamics. Clearly, the uniqueness of the MAPK pathway responses suggested that very different invasion/evasion mechanisms have evolved for each pathogen. More sophisticated methods are needed to identify potential points of host response disruptions. This is done by interrogating the trained DBN model for the MAPK Pathway for genes that exceed threshold Bayesian z-scores>|2.24| (“mechanistic genes”) and gene-gene network relationships (arcs). For example, shows the visualization of the MAPK pathway network. The network can be employed to visualize several key features that would otherwise be difficult to discern by looking at spreadsheet lists of genes. For example the state of gene modulation is distinguished by color-coded nodes. The state of upstream and downstream genes can be easily identified. Various threshold levels can be modified to identify significantly perturbed genes (annotated with orange circles, ). The strength of correlation between gene pairs is indicated by the color and thickness of the arcs connecting the genes.
In , we show a list of 20 specific gene-to-gene relations associated with the MAPK pathway () having strong positive and/or negative arc weight correlations. We normalized the DBN arc weights to allow equivalent comparison to other pathway gene-gene relations. In this arc weight table, a few significant relationships are numbered in the table and on the network () with an encircled number and arrow pointing to the corresponding arc. It is hypothesized that virulence factors from each pathogen can have different disruptive influences on the host's MAPK Signaling Pathway and that such disruption can be used to identify pathogenic mechanisms unique to each pathogen, thus providing a rationale for development of deletion mutants of the corresponding pathogen virulence factors as potential vaccine candidates. For example, the relation arc TRAF2->FLNA had a strong positive weight (correlated) for STM-Host (0.241) and for the BMEL-Host (0.204) while MAP-Host had a large negative weight (anti-correlated) of −0.17.
Arc weight correlation table showing highly correlated (positive weights) or anti-correlated gene-gene relationships (negative weights) learned from the model training.
The reversal of gene-to-gene arc weight of MAP-Host may indicate a disruption of either TRAF or FLNA gene by virulent factors of the pathogens, identifying potential novel points of interaction. TRAF2 encodes a protein that is a member of the TNF receptor associated factor (TRAF) protein family. This protein is required for TNF-alpha-mediated activation of MAPK8/JNK and NF-κB. It has a binding relationship with the filamin-A protein encoded by the FLNA gene. FLNA participates in the anchoring of membrane proteins to the actin cytoskeleton. This type of interaction analysis can be done for every pathway and used to identify novel differences between pathogen conditions. The visualization of mechanistic genes and arc weight enables an efficient identification of the differences between pathogenic influences.
Interrogating the Influence of Genes by Interactome Knockout Simulations
More detailed interrogation of the BMEL-Host model found that the gene, MAPK1, was significantly upregulated in the BMEL condition while not in MAP or STM. Further, it was observed that MAPK1 had a number of interactions with other genes within the MAPK Signaling pathway model that showed either very strong anti-correlated relationship such as the MAPK1->YWHAZ or strongly correlated relationship such as MAPK1->MAPT, while in MAP and STM host interactome models, the influence of MAPK1 was negligible. Specifically for the BMEL condition, we found that MAPK1 has a series of direct relationships with highly significant correlations as listed in . This could imply that MAPK1 has a unique role in BMEL pathogenesis during early host cell invasion. Mitogen-activated protein kinase 1 (MAPK1 or ERK 1/2) controls many biological functions (Johnson and Lapadat 2002
). The MAPK signaling cascade, represented by 3 well characterized subfamilies of MAPKs (ERK1/2, JNK and p38), has been implicated in bacterial internalization (Tang, Sutherland et al. 1998
) and intracellular survival and replication (Hobbie, Chen et al. 1997
; Palmer, Hobbie et al. 1998
; Schorey and Cooper 2003
). Jimenez-Bagues et al
. (Bagués, Gross et al. 2005
) demonstrated the importance that the integrity of the MEK - MAPK - ERK 1/2 pathway has on the elimination of rough Brucella suis
in macrophages. To identify the importance of this MAPK signaling pathway in BMEL invasion and intracellular survival in HeLa cells, a siRNA molecule (ID1449) was used to knock-down MAPK1 expression. Our results confirmed that the internalization of BMEL decreased more than 60% when the gene was knocked-down with the siRNA molecule as shown in .
MAPK1 gene interactions pairs that have significant correlated or anit-correlated relationships determined by interrogating the MAPK Signaling pathway model.
To gain better insight regarding the influence of MAPK1 on other genes, we employed the interactome model to simulate the MAPK1 knockdown in both the MAPK Signaling and Regulation of Actin Cytoskeleton models. The simulation identified a set of genes that were heavily influenced by MAPK1 as shown in the gene expression plots of in which the simulation data is plotted in comparison to the actual data used to train the interactome model. Interestingly, the simulation identified genes that were both in correlation with the reduced MAPK1 knockdown expression as well as several that had an increase in expression (anti-correlated). Either set of correlated or anti-correlated genes could be considered important contributors to the observed internalized reduction of BMEL in the HeLa host cells. For example, a correlated gene, SRF (serum response factor), is known to be involved in actin filament organization, regulation of cell adhesion, negative regulation of cell migration, negative regulation of cell proliferation, and regulation of transcription. Another correlated gene, MAPT (Microtubule-associated protein tau) is associated with regulation of microtubule depolymerization. The anti-correlated gene, YWHAZ (14-3-3 protein zeta/delta), is known to be involved in the biological processes of anti-apoptosis, histamine secretion by mast cell and signal transduction. The anti-correlated gene RHOA (Transforming protein RhoA) is associated with actin cytoskeleton organization, regulation of I-kappaB kinase/NF-kappaB cascade, and cell adhesion. This type of analysis is an integral part of the “incremental systems biology interactome modeling” process and introduced here as preliminary illustration as to how simulation/inferencing of the interactome model can be employed to guide next phases of in vitro and in vivo experimentation.
Biological System Model Generation for Comparative Pathogenicity Analysis
Comparative pathogenicity is a method by which the host response between different pathogens or pathogen vaccine candidates can be utilized to elicit unique and/or common biomarkers of immunogenicity. Utilizing BioSignatureDS™, the significantly perturbed pathways and gene groups from DBGGA were integrated to construct a plausible system level model of the STM wild type (WT) condition versus an isogenic ΔsipA, sopABDE2 mutant. The system model encompasses whole time-course patterns and multi-conditional behaviors of larger groups of genes and proteins than utilized only in the pathways. The system model expands the relationship of genes across related pathways and can be used for more efficient comparative modeling, pattern recognition and simulation supporting “what-if” type of analyses as previously described at the pathway level. The system model is constructed from a method based on merging of pathways with known gene/protein relationships and produces a trained and optimized network model similar to the MAPK signaling pathway network shown in . Following this procedure, a system model was constructed from 10 selected pathways (listed in ) showing significant perturbation between the host infected with STM WT and STM mutant. The resulting model has a common network structure that is trained using the host response data for the WT, mutant and control conditions and was comprised of 930 genes and over 1500 gene-to-gene relations. By interrogating the model, we identified a number of significantly differentially expressed genes (|Bayesian z-score| >= 2.24) as shown in the center heatmap of . From this heatmap, the difference in the STM mutant shown as green in the heatmap are a subset of genes which were found to be very highly up regulated in the STM mutant compared to the wild type and could be considered candidate genes governing the effective immune response of the host. These genes also form the basis of a biosignature that can be correlated to immunogenicity for more rational vaccine development. As expected for the WT, we found increased expression of genes associated with immune response such as those encoding IFNG, TNF, TLR4, and as well as genes associated with signaling and regulation of the actin cytoskeleton.