|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: MS JG FK. Performed the experiments: MS KT LW. Analyzed the data: MS FK. Contributed reagents/materials/analysis tools: JC YW. Wrote the paper: MS JG FK.
Biological signaling processes may be mediated by complex networks in which network components and network sectors interact with each other in complex ways. Studies of complex networks benefit from approaches in which the roles of individual components are considered in the context of the network. The plant immune signaling network, which controls inducible responses to pathogen attack, is such a complex network. We studied the Arabidopsis immune signaling network upon challenge with a strain of the bacterial pathogen Pseudomonas syringae expressing the effector protein AvrRpt2 (Pto DC3000 AvrRpt2). This bacterial strain feeds multiple inputs into the signaling network, allowing many parts of the network to be activated at once. mRNA profiles for 571 immune response genes of 22 Arabidopsis immunity mutants and wild type were collected 6 hours after inoculation with Pto DC3000 AvrRpt2. The mRNA profiles were analyzed as detailed descriptions of changes in the network state resulting from the genetic perturbations. Regulatory relationships among the genes corresponding to the mutations were inferred by recursively applying a non-linear dimensionality reduction procedure to the mRNA profile data. The resulting static network model accurately predicted 23 of 25 regulatory relationships reported in the literature, suggesting that predictions of novel regulatory relationships are also accurate. The network model revealed two striking features: (i) the components of the network are highly interconnected; and (ii) negative regulatory relationships are common between signaling sectors. Complex regulatory relationships, including a novel negative regulatory relationship between the early microbe-associated molecular pattern-triggered signaling sectors and the salicylic acid sector, were further validated. We propose that prevalent negative regulatory relationships among the signaling sectors make the plant immune signaling network a “sector-switching” network, which effectively balances two apparently conflicting demands, robustness against pathogenic perturbations and moderation of negative impacts of immune responses on plant fitness.
When a plant detects pathogen attack, this information is conveyed through a molecular signaling network to turn on a large variety of immune responses. We investigated how this plant immune signaling network was organized using the model plant Arabidopsis. Wild type and mutant plants with defects in immune signaling were challenged with a pathogen. Then, expression levels of many genes were measured using microarrays. Detailed analysis of the mutation effects on gene expression allowed us to build a signaling network model composed of the genes corresponding to the mutations. This model predicted that the network components are highly interconnected and that it is very common for network components that mediate different signaling events to inhibit each other. The prevalent signaling inhibitions in the network suggest that only part of the signaling network is usually used but that if this part is attacked by pathogens, other parts kick in and back up the function of the attacked part. We speculate that plant immune signaling is highly tolerant to pathogen attack due to this backup mechanism. We also speculate use of only part of the network at any one time helps minimize negative impacts of the immune response on plant fitness.
To understand the regulation of a particular biological process, it is important to elucidate what structural features of the signaling network regulating the process govern the behavior of the signaling network as a whole , . With a complex signaling network, in which components are highly interconnected, this is a challenging task. One problem is that the function of a sector of the network can be compensated by some other sector, and, consequently, functional identification of these sectors by knocking out each of the sectors is difficult. In this example of network compensation, it is assumed that these network sectors are functionally redundant but mechanistically distinct: they are not composed of homologous molecular components. General strategies to efficiently elucidate the structure of a complex signaling network are in demand.
The plant immune signaling network, which regulates defense triggered upon pathogen attack, is such a complex network. Two modes of plant immunity, pattern- and effector-triggered immunity (PTI and ETI) have been characterized in resistance against biotrophic and hemi-biotrophic pathogens . PTI is initiated by recognition of a microbe-associated molecular pattern (MAMP) by the corresponding pattern recognition receptor (PRR), which is typically integrated in the plasma membrane. For example, a fragment of bacterial flagellin, flg22, is a MAMP, and is recognized by the FLS2 receptor-like kinase PRR in Arabidopsis . Pathogens adapted to a particular plant host deliver effectors which interfere with PTI . Countering pathogen effectors, plants have acquired another class of receptors, resistance (R) proteins, that specifically recognize particular effectors, leading to induction of ETI. For example, the Arabidopsis R protein RPS2 indirectly recognizes the Pseudomonas syringae effector AvrRpt2 , .
Although the way pathogen attack is recognized is distinct between PTI and ETI, they are not separate, but rather form an integrated immune system. The intimate relationships between PTI and ETI have been suggested by the facts that many downstream events are shared. For example, in Arabidopsis, MAP kinases 3 and 6 are rapidly and transiently activated in PTI and activated for an extended period in ETI . Reactive oxygen species (ROS) production in PTI is absolutely dependent on the NADPH oxidase RBOHD, and ROS production in ETI is largely dependent on RBOHD , . The nitric oxide (NO) signaling sector comprised of NO-associated 1 (NOA1) protein and NIA1 and NIA2 nitrate reductases is also involved in both PTI and ETI , . Furthermore, similarities in the PTI and ETI transcriptome responses have been pointed out .
The signaling sectors defined by the phytohormones, salicylic acid (SA), jasmonic acid (JA), and ethylene (ET), are important in plant immunity: generally the SA sector for immunity against biotrophic and hemi-biotrophic pathogens and the JA and ET sectors for immunity against necrotrophic pathogens , , . The iso-chorismate synthase SID2 (ICS1)  and the MATE-type transporter EDS5  are required for SA synthesis in response to pathogen attack. NPR1  is a major positive regulator of SA responses. The regulators EDS1 and PAD4 are important for SA accumulation as well as SA-independent signaling functions , , . The JA sector contains the JAR1 enzyme that produces the JA-Ile conjugate, which is the active form of JA , the F-box protein COI1, which responds to JA-Ile by targeting the JAZ transcription repressors for degradation , and the JIN1 Myc transcription activator . The metal-ion transporter EIN2 is required for most ET responses , and the EIN3 transcription activator positively regulates some ET responses . Other phytohormones, such as abscisic acid, auxin, brassinosteroids, and gibberellins, are also involved in plant immune signaling . Although the phytohormone levels change during PTI and ETI, the specific effects of the phytohormone sectors in PTI and ETI had been considered to be limited or unclear , , .
Recently, we demonstrated that both flg22-triggered PTI (flg22-PTI) and AvrRpt2-triggered ETI (AvrRpt2-ETI) are mostly dependent on the signaling network defined by the SA, JA, ET and PAD4 sectors . Therefore, the signaling machinery is extensively shared between flg22-PTI and AvrRpt2-ETI. A main difference between PTI and ETI appears to reside in how the sectors in the common network interact one another. If this is true, then to further our understanding of the integrated plant immune signaling network, it is important to elucidate the global regulatory relationships among the network components.
One major use of mRNA profiles is as detailed descriptions of biological states, because an mRNA profile data set is a massive phenotypic data set. This use was pioneered by the “compendium” approach, in which mutations and chemicals that cause similar changes in mRNA profiles are hypothesized to be involved in the same biological processes . In our earlier studies, we implemented non-linear dimensionality reduction  in combination with graphical representation to reveal multi-dimensional relationships with locally variable dimensionalities among the mRNA profiles , , . In this way, information about the nature of similarities between mRNA profiles was obtained in addition to the scalar similarities, and novel relationships among Arabidopsis mutants and accessions were discovered.
Here, we report an integrated regulatory relationship model comprised of 22 components including most of the genetically-defined major regulators of immunity in Arabidopsis. The network structure was inferred based on mRNA profiles for 571 immune response genes of Arabidopsis mutants with defects in immune regulatory genes. The mRNA profiles were collected at a single time point six hours post inoculation (hpi) with the bacterial strain P. syringae pv. tomato DC3000 expressing the effector AvrRpt2 (Pto DC3000 AvrRpt2). This strain feeds multiple inputs to the network. The regulatory relationships were inferred by recursively applying a non-linear dimensionality reduction procedure, which allowed detection of many weak relationships. The model correctly predicted 23 out of 25 previously known relationships, suggesting the accuracy of newly predicted relationships. Two features of the network model were readily evident: the network components were highly interconnected; and negative regulatory relationships between signaling sectors were very common. We confirmed the latter point in one case by demonstrating a mutual inhibition between the SA and early MAMP-triggered (EMT) signaling sectors. Based on the prevalent negative regulatory relationships, we propose “sector-switching” as an important property of the plant immune signaling network.
mRNA profiling was used to collect detailed descriptions of the network state, and the changes in the network state were determined by comparing the mutant mRNA profiles with the wild-type mRNA profile. One advantage of this approach is that regulatory mechanisms defined by the mutations do not have to be regulated at the mRNA level. For example, a biological process that is regulated by the activity of a protein kinase can be studied using the mRNA profile of the protein kinase mutant even when the mRNA level of the protein kinase is not regulated in this process. This is because the mRNA levels of particular genes were not used as proxies for the activities of the gene products, instead, the mRNA profile changes in a mutant plant compared with the wild-type plant were used as the effects caused by the mutation. Another advantage is that the number of genes in the profiles need not be very high: the genes to be profiled only need to cover (almost) all the expression patterns across the mutants used in the study. We previously reported a dedicated custom microarray that accurately monitors the mRNA levels of 571 Arabidopsis genes, which represent mRNA profile patterns across many conditions related to pathogen infections . Use of this small-scale microarray made this project economical even though we used three biological replicates for profiling.
We collected mRNA profiles of the mutants and the wild type Columbia-0 (Col-0) from leaf tissues after inoculation of Pto DC3000 AvrRpt2. The inoculation dose was sufficiently high for most parenchymal cells to have direct contact with the bacteria. As parenchymal cells are predominant in leaf tissues, this biological system is relatively homogenous at the cellular level. The bacterial strain can stimulate multiple signaling pathways: AvrRpt2 triggers RPS2-mediated ETI, which involves SA-mediated signaling and ROS and NO bursts , , , ; the phytotoxin coronatine produced by the strain mimics JA-Ile and activates JA-mediated signaling ; MAMPs, such as flg22, trigger PTI, whose early responses include MAP kinase 3 and 6 (MPK3/6) activation, ROS and ET bursts, and callose deposition , , , . Thus, this strain feeds inputs into the network from multiple different points, which allows us to probe a large part of the network at once.
Arabidopsis mutants with defects in canonical immune signaling components were used to specifically perturb various points in the signaling network. Table 1 lists the Arabidopsis mutants used in this study, the functions of the corresponding genes, and their signaling sector assignments.
A single time point of six hpi was chosen for cost-effectiveness. The time point was determined based on our previous observations : the number of genes with expression changes was much higher at 6 hpi than 3 hpi; and while the profile at 9 hpi was similar to that at 6 hpi, we reasoned that the earlier profile may contain more relatively early effects of the genetic perturbations.
The principle used in network inference is that genes whose mutations cause similar effects on mRNA profiles share regulatory relationships: one regulates the other, both similarly regulate the mRNA levels of the same genes, both are regulated by the same regulator, or the relationships are a combination of these. Such regulatory relationships were visualized by links between the vertices corresponding to the mutant genes in a graphical representation of the network: a positive link when the direction of observed mRNA level changes was the same and a negative link when the direction was opposite.
The mRNA profiles were collected in multiple experiment groups and combined into a single data set using mixed linear models (MATERIALS AND METHODS, Table S1). The overall experimental design regarding the experiment group was not symmetric, and the overlapping genotypes in any particular combination of experiment groups were limited. These features may have introduced some biases in the data set. To compare mutation effects, log2-transformed expression values of genes in the wild type mRNA profile were subtracted from log2-transformed expression values of genes in each mutant mRNA profile, and the obtained log-transformed mRNA profile change was scaled across the genes, but not centered, to preserve the signs of the values (which is called a difference profile hereafter). Linear dimensionality reduction was applied locally (Locally Linear Embedding, LLE; ), so that the same types of mRNA profile changes do not make redundant links. Although the above procedure is in principle the same as used in our previous studies , , , we implemented an additional concept in the current study. In the previous procedure, mutant difference profiles that are local to a particular mutant difference profile are defined based on the global distance in the difference profile space. However, mutants that have a weak regulatory relationship, such as one corresponding to weak cross-talk, may not be detected as their difference profiles may not be located closely in the global space. In the new procedure, named Repetitive Euclidean-distance Locally linear Embedded Graph Generator (RepEdLEGG), the residual from the first round of LLE was subjected to another round of LLE. This recursive application of LLE enabled detection of such weak regulatory relationships (Figure S1). The overall workflow of the network inference procedure is summarized in Figure 1.
With the above procedure, we obtained a regulatory relationship model for 22 genes corresponding to the mutations with 67 undirected links, which we refer to as our network model (Figure 2). Our network model has a form of an undirected graph since a single time-point data set does not allow inference of the direction of relationships without an additional assumption. Forty-eight and 19 links represented positive and negative regulatory relationships, respectively (Figure 2A and 2B, Figure S2, Table S2). To evaluate the accuracy of the predicted regulatory relationships, the published literature was surveyed for supporting experimental data (Table S3). Twenty-five pairwise regulatory relationships between genes used in this study, that included information about the sign of the relationships, were found in published literature. Our network model correctly predicted 23 out of the 25 known regulatory relationships. One of the relationships not correctly inferred was the JIN1-MPK6 relationship: MPK6 was described as a negative regulator of JIN1  whereas our model predicts a positive relationship between them. The other was that the model did not predict a direct relationship corresponding to negative regulation of SID2 by EIN3, described in Chen et al. . However, when JAR1, which was connected positively and negatively with EIN3 and SID2, respectively, was removed from the input data set, the negative regulatory relationship between EIN3 and SID2 was inferred (Table S4). Under our experimental conditions, JA signaling could be strong due to coronatine and may have masked the effect of EIN3, which mediates ET signaling. Note that the known links were established with data from diverse experiments conducted using various Arabidopsis-pathogen interactions, performed by many different research groups. While such studies helped us to select useful mutants for our study, our network model was built based solely on mRNA profile data collected using a single experimental setup with a single time point. This fact demonstrates the richness of information in descriptions of the network state consisting of mRNA profiles and the high efficiency of network inference using mRNA profiles as detailed descriptions of network states. The high accuracy in prediction of previously known regulatory relationships suggests the accuracy of newly predicted regulatory relationships.
The specificities of links can also be examined by removing the data for one mutant from the data set. For instance, a positive link between EIN3 and MPK6 was predicted in the model. This is consistent with the observation in Yoo et al.  that EIN3 is phosphorylated and activated by MPK6. The direction of this regulatory relationship is from MPK6 to EIN3 but not from EIN3 in the ET sector to MPK6 : in other words, this link is specific to EIN3 but not for the ET sector in general. Therefore, a link between EIN2 in the ET sector and MPK6 should not be made if EIN3 is removed from the model (i.e., if the model is made using the data set with the ein3 difference profile removed). In the resulting model with EIN3 removed, the link between MPK6 and the other ET signaling component EIN2 was not generated (Figure S3A and A′, Table S5). Thus, the specificity of the biochemical regulation was captured in our network model.
It should be noted that each link may not represent a simple logical relationship. For example, the link between vertices A and B may represent expression changes in one subset of genes profiled and the link between vertices B and C may represent expression changes in a different subset of genes profiled. Therefore, among three vertices a circular link of positive, positive, and negative (e.g., links among MPK3, MPK6, and NHO1) does not necessarily present logical conflicts.
As expected, genes assigned to the same signaling sectors were predicted to have positive regulatory relationships except for the ROS sector (Figure 2A). Although RBOHD and RBOHF, the two respiratory burst oxidase homologues, were assigned to the ROS sector, it is known that single rbohD and rbohF mutants have different pathogen-responsive ROS accumulation and HR cell death phenotypes , . Consistently, difference profiles of the two mutants were uncorrelated (uncentered Pearson correlation coefficient between the expression changes from wild type: 0.043). Thus, it is reasonable that no positive link was predicted between the two RBOH genes.
Positive regulatory relationships between signaling sectors were also predicted. Among them, positive regulatory relationships between the NO and SA sectors were of particular interest. In our network model, NOA1 had links with NPR1 and PAD4. Indeed, the noa1 difference profile had higher correlation with the pad4 and npr1 difference profiles than the nia2 difference profile (uncentered Pearson correlation coefficients of 0.876, 0.831, and 0.714 with the pad4, npr1, and nia2 difference profiles, respectively). In the model made without NOA1, NIA2 replaced NOA1 in the links with the two SA sector components, PAD4 and NPR1 (Figure S3B and B′, Table S6). Therefore, the positive regulatory relationships between NOA1 and the SA sector components are not specific to NOA1, but they indicate positive regulatory relationships between the NO and SA sectors in general. On the other hand, the fact that the predicted regulatory relationships between NOA1 and the SA sector are stronger than those between NIA2 and the SA sector is consistent with the observation that NOA1, not NIA1/NIA2, is responsible for SA-induced NO accumulation .
Negative regulatory relationships are very common between signaling sectors in our network model while negative regulatory relationships within each signaling sector are absent. The NO sector was an exception as it does not have any negative links with other sectors. The JA sector had negative relationships with most of the other signaling sectors tested. The SA sector was negatively linked with PMR4, MPK3/6, and the ET and JA sectors. Prevalent negative regulatory relationships between sectors strongly suggest that a limited number of signaling sectors are highly activated at a given time as the active sectors suppress the other sectors.
Both the EMT and the SA sectors positively contribute to defense against the virulent strain Pto DC3000 , , . Figure 3A illustrates a subnetwork of our network model featuring the EMT and SA sectors. We consider that RBOHD, PMR4, MPK3/6, and the ET sector comprise the EMT sectors because RBOHD-dependent ROS production , PMR4-dependent callose deposition , MPK3/6 activation , and ET accumulation  are early MAMP responses. Note that although we designate them as the EMT sectors, RBOHD-dependent ROS production and MPK3/6 activation also occur for extended periods during ETI , . We previously reported that MAMPs can trigger accumulation of SA and thereby activate SA signaling , i.e., the EMT sectors positively regulate the SA sector. However, our network model contains negative links as well as positive ones between the sectors, suggesting that the regulatory relationships between the sectors can be positive or negative, depending on the context. In the following sections, we closely investigate this subnetwork of the EMT and SA sectors.
Callose deposition is a cell wall-based defense following recognition of pathogens . PMR4 is the callose synthase responsible for callose deposition upon infection with pathogens or treatment with elicitors , , . Our model predicted a negative relationship between PMR4 and SID2 (Figure 3A). It was previously reported that SA-mediated signaling is up-regulated in pmr4-1 plants , i.e., PMR4 negatively regulates the SA sector, which can explain the predicted negative regulatory relationship between PMR4 and SID2. Can SA signaling also affect callose deposition? We quantified callose deposition in the SA sector mutants, npr1-1 and sid2-2, after flg22 treatment according to the method described in Denoux et al. . Together with the SA sector mutants, pbs2-1 (a mutant with a RAR1 deletion) was included as a mutant with potentially enhanced callose deposition. RAR1 has a negative link with PMR4 in our network model, and different RAR1 alleles rar1-20 and rar1-29 were reported to have enhanced callose deposition phenotypes . The callose deposition level in cotyledons of 10 day-old seedlings grown in liquid culture was measured at 6 and 16 hours post treatment (hpt) with 1µM flg22 (Figure 3B). Consistent with a previous report , no significant difference in the flg22-triggered callose deposition level was observed at 16 hpt between Col-0 wild type and the SA sector mutants. However, the callose deposition levels at 6 hpt in the SA sector mutants were significantly higher than in Col-0. At 6 hpt, the callose deposition level in Col-0 was not significantly different from the flg22-receptor mutant fls2C, so the Col-0 level was the background noise level. These results indicate that flg22-triggered callose deposition is enhanced in the SA sector mutants at an early time point: the SA sector negatively regulates the PMR4 sector. Thus, negative regulatory relationships between PMR4 and the SA sector are mutual.
There is also a positive relationship between PMR4 and NPR1. It has been reported that pretreatment with SA can compensate loss of the flg22-triggered callose deposition caused by a pen2 mutation . The positive PMR4-NPR1 link may correspond to this SA-enhanced callose deposition in pen2 plants. Such context-dependent regulatory relationships involving PMR4 were anticipated as PMR4 has a higher number of links compared with other genes in our network model.
ROS production minutes after treatment with flg22 is one of the very early MAMP-triggered responses. RBOHD is required for flg22-triggered ROS production . A positive regulatory relationship between the SA sector and ROS production was predicted as RBOHD has a positive link with NPR1 in our network model (Figure 3A). We tested whether pretreatment with SA and/or a mutation in NPR1 affect flg22-triggered ROS production. Pretreatment of plant tissues with SA rather than co-treatment with SA and flg22 was chosen since flg22-triggered ROS production starts within a few minutes after addition of flg22. Col-0 and npr1-1 were pretreated with 5µM SA or water for 3 hours before they were treated with 1µM flg22 or water. Pretreatment with SA enhanced ROS production in Col-0 wild type during the period between 3 and 12 minutes after treatment with flg22 (Table S7). This enhanced ROS production was abolished in npr1-1, which indicates that SA positively regulates ROS production in an NPR1-dependent manner. This observation is consistent with the model prediction of an NPR1-RBOHD positive regulatory relationship (Figure 2A).
To further examine regulatory relationships between the EMT and the SA sectors, effects of SA and flg22 on the EMT and SA sectors, respectively, were examined using the mRNA level of a marker gene as a proxy for activity of each sector. Wild-type seedlings grown in liquid culture were treated with flg22 and/or SA. The mRNA levels of a putative chitinase (At3g43620)  and the PR-1 (At2g14610) genes were quantified for the EMT and SA sector activities, respectively (Figure 4). Induction of SA accumulation by flg22 was not significant at 3 hpt . We measured the marker gene mRNA levels up to 3 hpt, so SA accumulation caused by flg22 treatment was negligible. Treatment with 500 or 5 µM SA induced PR-1 mRNA accumulation by 3 hpt. An inhibitory effect of 1 µM flg22 on PR-1 mRNA induction was observed with 5 µM SA at 3 hpt but not with 500 µM SA. An inhibitory effect of 500 µM but not 5 µM SA on induction of the chitinase mRNA accumulation by 1 µM flg22 was observed at 3 hpt. Significant inhibitory effects of 1 µM flg22 and 500 or 5µM SA were not observed 1 or 2 hpt (Figure S4). Thus, the EMT and SA sectors have mutual inhibitory effects in a dose-dependent manner.
We used mRNA profiles of mutant plants for inference of regulatory relationships among the genes corresponding to the mutations. This use of mRNA profiles was pioneered by the “compendium” approach , and further developed, for example, to the “connectivity map” approach . However, these approaches focused on the most prominent similarities in the global space and did not intend to dissect combinations of similarities to reveal multi-dimensional similarity relationships among mRNA profiles. In our earlier work, we combined the LLE algorithm  and graphical representation to visualize differences among similarities in mRNA profiles of Arabidopsis mutants with variable local dimensionalities to reveal different mechanisms used in plant immunity . However, the analysis in our earlier work was limited to the local space defined by the global distance. In the current study we used RepEdLEGG, in which LLE was recursively applied to the residual of the first round of LLE. This approach enabled us to detect weak regulatory relationships and to reveal a highly interconnected network structure.
A limitation of using mRNA profiles as descriptions of the network state is that the resolution of the network is determined by the number of network states measured – e.g., in our study, the number of Arabidopsis mutants profiled. It should be noted that in our network model, when the genes corresponding to the mutations were linked, the link means that the genes or some other network components near the genes in the actual signaling network have regulatory relationships.
On the other hand, an advantage of this approach is that the regulatory mode of the gene defined by a mutation does not have to be transcriptional although mRNA profiles are used for network inference. For example, we detected the regulatory relationship in the MPK6-EIN3 link even though MPK6 does not affect EIN3 expression, but rather its phosphorylation. Because the plant immune signaling network contains many major non-transcriptional regulatory components , , , , this advantage of the approach was essential for us to obtain a global network model using a single methodology.
Using the predicted relationships between the EMT and SA sectors as examples, we have demonstrated that the resulting undirected regulatory relationships are highly informative in generation of hypotheses to guide intensive studies in focused parts of the network. We built this highly informative model in a cost-effective manner: mRNA profiling using a small-scale array at a single time point under a single experimental condition. Therefore, applications of this approach should be beneficial in studies of complex signaling networks in any genetically tractable organisms.
Complex regulatory relationships among the network components strongly suggest that many relationships are dependent on context, such as the quantities and the states of other network components. To deepen our understanding of the signaling network, it will be important to elucidate the dynamic relationships among the network components. As the cost of mRNA profiling is rapidly decreasing, it will soon be practical to collect mRNA profiles of wild-type and many mutant plants at many time points. Such time-series mRNA profile data will enable extension of our network model to include information about network dynamics.
Furthermore, cost reduction in mRNA profiling will improve applications of the approach used in this study. First, it could allow a symmetric and highly-overlapping experiment group design, which would reduce potential biases in the data set. Second, it could allow inclusion of mRNA profiles from uninfected plants of all the genotypes. Inclusion of such profiles would enable separating the genotype effect and the genotypeinfection interaction for each profiled gene, which we cannot do with the current data set that only includes infected plants. However, expression level information from many genes is combined as the network state description in our approach. Different genes have different ratios between the genotype effect and the genotypeinfection interaction. A data set that includes information from such genes allows incorporation of information about the genotype effect and the genotypeinfection interaction in the network inference. This may have contributed to the success of our approach in the absence of mRNA profiles from uninfected plants. Third, cost reduction could allow profiling of many more genes. If many more genes are profiled, some aspects of the network states that evaded detection in mRNA profiles of a limited number of genes (571 genes in this study) may be detected, which could lead to discovery of additional weak regulatory relationships among the network components.
Implementation of RepEdLEGG was a key to building the highly interconnected network model. Thirty-two out of 67 links predicted were obtained in the second round of LLE using the residuals from the first round of LLE as the response. Eight out of the 32 links found in the second round were supported by previous evidence. These links found in the second round connect vertices whose global distances are not particularly small and represent weak regulatory relationships. The validities of many links found in the second round of LLE indicate that common multivariate analysis methods that depend solely on the global distance are not ideal for inference of a highly interconnected network.
Among existing methods, partial correlation is a method that can detect weak regulatory relationships , like RepEdLEGG. The partial correlation between vertices X and Y is defined, when all the other vertices are Z1, …, Zn, as the correlation between the residual of the linear regression of X with Z1, …, Zn and the residual of the linear regression of Y with Z1, …, Zn. When the results of RepEdLEGG and the partial correlation were compared using the data set used in this study (q<0.01), 51 links were predicted in common (Figure S5). There were 16 and 5 links unique to RepEdLEGG and the partial correlation, respectively. Whereas 7 out of the 16 links uniquely predicted by RepEdLEGG had supporting literature evidence, none of the links unique to the partial correlation did. This result suggests a higher accuracy of inference by RepEdLEGG than by partial correlation. We speculate that the difference between the two methods resulted from a difference in the size of the space that is considered linear for each vertex. While RepEdLEGG constrains the linear space to that delimited by the neighboring vertices found in the first and second rounds of LLE, partial correlation assumes that the entire global space is linear. Although RepEdLEGG is hampered by the arbitrariness in determining the size of the linear space (i.e., determining the number of neighbor vertices), the superior performance of RepEdLEGG over the partial correlation suggests that constraining the size of the linear space is important in modeling of a complex regulatory network.
Guided by our network model, we have demonstrated that the EMT and SA sectors can antagonize each other. Such mutual inhibition is not intuitive since both sectors positively contribute to resistance against Pto DC3000 , . In addition, it appears to contradict our previous report that MAMPs trigger SA accumulation , which is equivalent to positive regulation of the SA sector by the EMT sectors. It should be noted that two important aspects, kinetic and quantitative effects, are overlooked in these simplified arguments. The induction of SA accumulation by flg22 clearly takes longer than 3 hpt  while the mutual inhibition between the EMT and the SA sectors was evident at 3 hpt (Figure 4). In addition, we observed dose dependence in the mutual inhibition: inhibition of the SA sector by flg22 was effective only when SA signaling was weak while inhibition of the EMT sectors by SA was effective only when SA signaling was strong (Figure 4). We think that such kinetic and quantitative effects play important roles in coordinating positive and negative regulatory relationships between these sectors.
The plant immune system must be robust against various perturbations caused by pathogens, which typically evolve much faster than plants. At the same time, not only are immune responses energy-expensive  but at least some are also detrimental to the plant fitness , , . Therefore, ideally immune responses should be contained at the minimally necessary level. We speculate that to balance these apparently conflicting selection pressures, the EMT and SA sectors adjust the level of immune responses according to demand through the positive and negative regulatory relationships between them (Figure 5). When the plant is attacked by a pathogen, the EMT sectors are activated based on recognition of MAMPs. While the activation of the EMT sectors starts the activation of the SA sector with a delay, the SA sector does not become highly activated due to suppression by the strongly-activated EMT sectors. This is probably because detrimental effects of defense components controlled by the EMT sectors are less severe than those of the SA sector: if defense components controlled by the SA sector are not necessary, it is better not to activate them. The delay in activation of the SA sector by the EMT sectors is important in buying time for evaluation of the effect of the EMT sector-mediated defense. However, if the pathogen is to some extent adapted to the plant host and its effectors interfere with the EMT sectors, the resulting weakened activity of the EMT sectors could release the SA sector from suppression. In fact, several P. syringae effectors, such as HopAI1 , target components of the EMT sectors. Using the SA sector-controlled defense components against more virulent pathogens is reasonable, as the SA sector-controlled defenses are known to be potent in defense against biotrophic and hemi-biotrophic pathogens . Thus, an elaborate combination of positive and negative regulatory relationships between the EMT and the SA sectors may enable shifting the balance between the EMT sectors for defense against less virulent pathogens to keep negative impacts of the immune response on plant fitness low and to reserve the SA sector for defense against more virulent biotrophic and hemi-biotrophic pathogens.
In our network model there are many inter-sector regulatory relationships. Such a high connectivity suggests a democratic network, in which each component of the network has a relatively small contribution to the function of the network and the level of contribution from each component is similar. We recently demonstrated that the AvrRpt2-ETI is robust against network perturbations because of positive contributions from each sector to immunity and compensatory interactions among them . So, the network for AvrRpt2-ETI signaling appeared to be democratic. However, our current study showed that negative regulatory relationships are very common between different signaling sectors, such as between the EMT and the SA sectors. We speculate that the EMT and SA sectors are not exactly democratic: one of them is more active under a particular condition, and the other is suppressed by the active one; if the active sector is inhibited, the other sector gets activated to compensate. So, the apparent redundancy in immune signaling does not result from simple functional redundancy but from switching between the sectors. The prevalence of inter-sector negative regulatory relationships suggests that such sector-switching is common at the whole network level, not just between the EMT and SA sectors. In fact, an antagonistic relationship between the SA and JA sectors is well documented . We propose to call this property of the signaling network “sector-switching”. If robustness of the immune system against fast-evolving pathogens had been the only driver in evolution, the signaling network could have evolved to be a simple redundant, democratic network. However, immune responses are generally deleterious to the host, and they impose fitness costs when the pressure from particular pathogens is not high . Together with the demand to minimize negative impacts of immune response, we speculate that the signaling network has evolved to have a sector-switching property, so that the activities of the signaling sectors are switched in response to inputs to the network, such as inputs for induction of PTI and ETI, and to external perturbations, such as perturbations by pathogen effectors, to balance the performance and the negative impacts of the integrated immune system.
All Arabidopsis plants, wild type and mutants, used in the study had the genetic background of accession Col-0. For mRNA profiling and ROS production assays, plants were grown in a controlled environment chamber at 22°C with 75% relative humidity and a 12h/12h light/dark cycle. For the assays using seedlings in liquid culture, seedlings were prepared essentially as described in Denoux et al.  with the following modifications: 0.25g/L as the concentration of sucrose in the culture medium, and the culture was incubated at 22°C. Pseudomonas syringae pv. tomato DC3000 carrying pLAFR3-avrRpt2 (Pto DC3000 AvrRpt2)  was used for inoculation of plants subjected to mRNA profiling.
Pto DC3000 AvrRpt2 was cultured in King's B medium at room temperature (~22°C) overnight and inocula were prepared at an OD600 of 0.05 in water. Leaves were infiltrated using a needle-less syringe as described in . The flg22 peptide (QRLSTGSRINSAKDDAAGLQIA) was synthesized by EzBiolab Inc. (IN, USA) and was used at indicated concentrations. Sodium salicylate (Fisher Scientific, PA, USA) was used to prepare SA solutions at 5 or 500 µM. For treatment of seedlings, plates were centrifuged at 500 rpm for 10 seconds to remove condensation 1 day before treatment.
Twenty-two mutants were divided into five experiment groups, and three biological replicates were made for each group, except for one (group 00) with two biological replicates. The data collection for the biological replicates was conducted at least one week apart. Each experiment group consisted of Col-0 in addition to seven mutants. Detailed information about grouping is provided in Table S1. The eight plants were grown at the outside positions of a 3×3 grid pattern in a 6″×6″ pot, and an additional Col-0 plant, which was not used for data collection, was grown in the center position of the grid pattern. The positions of the eight plants in each pot were randomly assigned. Some mutants in these experiments were irrelevant to this study and were excluded from analyses following normalization of mRNA profiles. The 5th experiment group (group 00) consisting of one or two mutants used in each of three experiment groups (groups 01, 02, and 03) and Col-0 was included to reduce potential bias associated with the experiment groups, e.g., biases associated with particular dates when experiments were conducted or particular combinations of genotypes tested together.
Two fully-developed leaves of each 4 week-old plant were inoculated with Pto DC3000 AvrRpt2. For each mRNA profile, inoculated leaves were harvested from three plants of the same genotype from three different pots at 6 hpi and pooled.
Procedures from target preparation to microarray data collection were performed as described in Sato et al. .
Raw expression data were normalized using the stable gene-based quantile normalization (SBQ) method . For comparison of profiles among different plant genotypes tested in different experiment groups, it was necessary to compensate for potential bias caused by separating genotypes to different groups. A 2-stage mixed effect linear model was fitted to the data from each experiment group separately:
where Y, G, T, R, γ, and ε are log2-transformed expression level value, gene, genotype, replicate, residual of the 1st model, and residual of the 2nd model. G and T are fixed effects, and R, γ, and ε are random effects. The second model was fitted for each gene separately.
Using the G:T values for the genotypes common between pairs of the experiment groups, calibration values among the experiment groups were calculated for each gene. The values in the initial SBQ-normalized data set containing all the experiment groups were corrected using the calibration values and were used to fit another 2-stage model:
where Y, G, T, E, R, γ, and ε are log2-transformed expression level value, gene, genotype, experiment group, replicate, residual of the 1st model, and residual of the 2nd model. G and T are fixed effects, and E, R, γ, and ε are random effects. The second model was fitted for each gene separately. The contrasts in the model were made to obtain the difference value between each mutant and Col-0 in each Tt + G:Tgt.
A data set with 480 genes each of which had at least one mutant genotype with the significant log2-transformed ratio value (q<0.05) were used to compare mRNA profiles of the genotypes (480 genes × 22 genotypes). The log2-transformed ratio values were not centered but scaled across the genes for each genotype (difference profiles). In this way, the order of the pairwise distances of the genotype difference profiles is invariant when either the uncentered Pearson correlation coefficient or the Euclidean distance is used. EdLEGG was modified from LEGG  to use the Euclidean distance instead of the uncentered Pearson correlation, so that multiple regression can be used for the calculation. Briefly, in a data set of n genes × m genotypes, the difference profile of genotype i is denoted as a vector in an n-dimensional space. For the vector of each genotype i, k closest neighboring genotype vectors were identified using the uncentered Pearson correlation coefficient. Pi is the set of such j (). The value k defines the size of the local space. Then the following multiple regression was fitted by minimizing the residual vector size :
In this first round of EdLEGG, the condition, , was applied to allow only positive regulatory relationships for the identification of major components illustrated in Figure S1B. k=6 was used in this study as this made some of aij for most i insignificant, which suggests that each local space was sufficiently sampled.
In RepEdLEGG, each residual vector was subjected to a second round of EdLEGG. For each , l closest neighboring genotype vectors were identified using the absolute value of the uncentered Pearson correlation coefficient. Qi is the set of such j (). In this way, the genotype vectors that are negatively correlated as well as positively correlated can be identified as neighbors, which allows detection of both negative and positive regulatory relationships. The following multiple regression was fitted by minimizing the residual vector size :
In this second round, the coefficients bij were allowed to take positive or negative values to include negative regulatory relationships. l=5 was used in this study as this made some of bij for most i insignificant, which suggests that each local space was sufficiently sampled. The p-value associated with each of the coefficients aij and bij, obtained from multiple regression, was corrected using the Benjamini-Hochberg False Discovery Rate (FDR)  to obtain the q-value, and the neighboring genotype vectors with coefficients significant for the indicated q-value threshold, , were identified for each genotype i ().
The output of RepEdLEGG was further evaluated using a leave-one-out (LOO) cross-validation. In each case, the profile for one of the 22 mutants was removed from the data set, and this LOO data set was subjected to RepEdLEGG analysis. Links that were found in at least 18 LOO cross-validation cases were considered significant. Note that for a particular link, 20 LOO data sets have both the genotypes flanking the link.
Then, all the LOO-filtered neighboring genotype vectors from both rounds were subjected to multiple regression together to obtain the final coefficients cij, which could be positive or negative, and their associated p-values by minimizing the residual vector size, :
The obtained p-value was FDR-corrected to obtain the q-value. When two significant coefficients were found for a single link (aij and aji), the coefficient with the smaller q-value was selected.
In the model, the significant links between the mutant genotypes are represented as the links between the genes corresponding to the mutations. The links are color-coded in Figure 2 according to their associated coefficient values.
To collect experimentally validated regulatory relationships, a systematic search of literature describing the 22 genes in our network model was conducted. “LocusPublished.20091204.txt” in TAIR (ftp://ftp.arabidopsis.org/home/tair/User_Requests/LocusPublished.20091204.txt) was used to map genes to literature. A custom Perl script was used to parse information about each gene of interest in the file to identify publications, each of which was simultaneously mapped to any pair of the 22 genes, and to generate hyperlinks to the PubMed records (http://www.ncbi.nlm.nih.gov/pubmed/) for the identified publications. Next, the contents of the identified publications were inspected for appropriateness. This relatively unbiased procedure identified 22 known regulatory relationships. Three more known regulatory relationships were added based on publications that were not included in “LocusPublished.20091204.txt” but that we knew. To our knowledge, these 25 relationships are the only relationships known for the 22 genes.
Ten day-old Col-0 seedlings grown in liquid culture were incubated with 1 µM flg22 for 6 or 16 hours. Cotyledons were harvested for staining with aniline blue. Staining and visualization procedures were described in Wang et al. . One image was obtained from each cotyledon. Stained callose deposits were counted using a custom macro combined with a custom plug-in for Image J (http:// rsb.info.nih.gov/ij/). The macro performs noise reduction, binarizing images, and counting objects with filtering for a particular size range.
Six week-old adult plants grown under the conditions described above were used. Eight leaf discs with a diameter of 4 mm were prepared and incubated for ~15 hours in sterilized water in 24-well flat-bottom cell culture plates (Corning, Inc., MA, USA) before pre-treatments with water or 5 µM SA. Leaf discs for mock and SA pre-treatments were collected from each half of the same leaves. Eight leaf discs were used for a single sample, and four replicated samples were made using different individual plants for each combination of genotype and treatment. Leaf disks pre-treated for 3 hours were then treated with 1 µM flg22 or water. These were considered to be four conditions: 2 pre-treatments×2 treatments. The ROS production level was measured as the relative luminescence value as described in Trujiro et al. . The results were analyzed by fitting a polynomial linear model through the ROS production curves of individual measurements and using a mixed-effect linear model on the coefficients of these curves :
where F, G, T, Tm, S, R, and ε are measured ROS production value, genotype, condition, time, sample, replicate, and residual, respectively. G, T, and Tm are fixed effects, and S, R, and ε are random effects. To avoid convergence problems, the coefficients of the (1+Tm+Tm2+Tm3+Tm4)|Sijk random effect were assumed to be independent and time was centered and scaled to range from −1 to 1.
Ten-day old Col-0 seedlings were treated with SA at an indicated concentration and/or 1 µM flg22, or water for 3 hours and harvested for RNA extraction. RNA extraction and quantitative RT-PCR were performed as described in Tsuda et al. . The Ct values of a putative chitinase (At3g43620) and PR-1 relative to Actin2 (At2g18780) were fitted to a mixed linear model:
where C, G:T, R, and ε are relative Ct value, genetreatment interaction, replicate effect and residual, respectively. G and T are fixed effects, and R and ε are random effects. The mean estimate of the genetreatment interaction was used as the modeled Ct value. For the t-tests, the standard error appropriate for each comparison was calculated using the variance and covariance values obtained from the model fitting.
The Gene Expression Omnibus (GEO) (http://www.ncbi.nlm.nih.gov/geo) accession numbers for data discussed in this paper are GSE19663 and GSM490922 to GSM490978.
Analysis of residuals from the first round of LLE in RepEdLEGG allows detection of minor similarity (Conceptual diagram). For the sake of visualization, log-transformed expression level ratio values of two genes (i.e., two dimensions) are plotted for 70 mutant plants, which correspond to the data points. (A) Major similarities among the data points can be identified as clusters in the global space. Four clusters are indicated by different colors of the data points. (B) For each of the red and blue clusters, major components (thick gray arrows) are identified. This can be achieved by the first round of LLE. (C) Once the major components are subtracted from the red and blue clusters (i.e., residuals), minor similarities (the star-like shapes of the clusters) can be identified. Note that since it is impossible to represent events in a high-dimensional space accurately in a two-dimensional space, this figure is by no means an accurate representation of the RepEdLEGG procedure. Instead, the purpose of this figure is to illustrate the idea that analysis of residuals allows detection of minor similarities in expression profile data, which correspond to weak regulatory relationships among the genes corresponding to the mutations.
(4.81 MB TIF)
Categories of links inferred by RepEdLEGG. (A) Links inferred by the RepEdLEGG procedure are classified into those inferred in the first and second rounds of the LLE procedure. The links inferred in the second round are further divided into positive and negative links. (B) Links with supporting, conflicting or no evidence. The chart in the inset indicates the proportion of evidence-supported links inferred in the first and second rounds. 1st evidence (+), evidence-supported links inferred in the first round of LLE; 2nd+ evidence (+), evidence-supported links for positive regulatory relationships inferred in the second round of LLE; 2nd- evidence (+), evidence-supported links for negative regulatory relationships inferred in the second round of LLE. Supporting evidence is listed in Table S3.
(0.24 MB TIF)
Analyses of the link specificities. (A and A′) The link between EIN3 and MPK6 is not ET signaling-dependent. To test specificity of the link between EIN3 and MPK6, the ein3 profile was removed from the data set, and the RepEdLEGG analysis was performed. A, The links involving EIN3 are highlighted in the full model (Figure 2C). A′, The links involving EIN2 are highlighted in the model with EIN3 removed. (B and B′) The NO sector has positive relationships with the SA sector. To analyze the specificity of links between the NO sector components (NOA1 and NIA2) and the SA sector components, the RepEdLEGG analysis was performed with the data set with the noa1 profile removed. B, The links involving the NO signaling components are highlighted in the full model (Figure 2C). B′, The links involving NIA2 are highlighted in the model without NOA1. Links inferred 17 times in the LOO cross-validation results were considered significant when a data set with one mutant profile removed is used. Note that a link can be inferred 19 times at maximum when one component is removed. The color codes of the links were determined based on coefficients associated with the links. The color codes for the vertices at the bottom of the figure show the signaling sector assignments of the genes corresponding to the mutations.
(1.01 MB TIF)
Time-course analysis of mutual inhibition between the EMT and SA sectors. The mRNA levels of the PR-1 and putative chitinase (At2g43620) genes were measured by qRT-PCR and used as proxies of the SA and EMT sector activities, respectively. The Actin 2 mRNA level was used to normalize the mRNA measurements. Ten-day old seedlings were treated with indicated concentrations of SA and/or flg22 and harvested for mRNA measurements at the indicated times. *, p<0.05 for the indicated comparisons.
(0.32 MB TIF)
Comparison of RepEdlEGG with partial correlation. The expression ratios between mutants and Col-0 wild-type (22 genotypes×480 genes) were analyzed using RepEdLEGG and partial correlation with LOO cross-validation. Links inferred 18 times in the LOO results were considered significant.
(0.19 MB TIF)
Experiment groups in mRNA profiling
(0.03 MB XLS)
Statistics for the Arabidopsis immune signaling network model
(0.04 MB XLS)
Regulatory relationships among the 22 genes supported by literature
(0.03 MB XLS)
The EIN3-SID2 relationship is masked by JAR1 in the full model.
(0.04 MB XLS)
Statistics for the network model without EIN3 (for Figs. S3 A and A′)
(0.04 MB XLS)
Statistics for the network model without NOA1 (for Figs. S3 B and B′)
(0.04 MB XLS)
Statistics for effects of SA pretreatment on flg22-triggered ROS production
(0.03 MB XLS)
We thank Zheng Jin Tu and the Minnesota Supercomputing Institute for access to and assistance with high-performance computers, Nick Hahn and the High Throughput Analysis Facility at University of Minnesota for operation of a liquid-handling robot, and Ken Vernick for letting us use his GenePix 4000B scanner. We thank Miguel Torres and Jeff Dangl for atrbohD, atrbohF and atrbohD/atrbohF double mutants, Jian-Min Zhou for nho1-2, and Paul Schulze-Lefert for pen2-1, sag101-2, and pad4-1/sag101. We also thank all the members of the Katagiri and Glazebrook laboratories for valuable discussion.
The authors have declared that no competing interests exist.
This work was supported by grant IOS-0419648 (Arabidopsis 2010 program) to J.G. and F.K. and grant MCB-0918908 to F.K. from the National Science Foundation and grant DE-FG02-05ER15670 from the Department of Energy Biosciences to J.G. M.S. was a recipient of a Research Fellowship of the Japan Society for the Promotion of Science for Young Scientists. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.