Living organisms need to sense the temperature of their surroundings. The heat-shock response network provides this sensing and controls the adaptation and survival of the cell. This network is a highly conserved genetic network that is connected with many other signaling pathways.
Different aspects of this network were studied and different architectures were proposed in computationally oriented studies [
1-
4]. Our starting point was the network described in [
2] because in that network architecture, temperature influences the hsp70 promoter activity directly. We were interested in the hsp70 promoter activity because our molecular construct is a Hsp70-GFP fusion gene transfected in Chinese hamster ovary (CHO) cells, as described in [
5]. However, two other architectures from [
3,
4], use a different input for the temperature stress, an input that was proposed for the first time in a computational network by [
1]. It is thus important to compare different heat shock network architectures. To this end and to connect our approach with the networks based on molecular interactions, we superimposed in Figure three heat shock networks from [
2-
4], each network with a specific color and each molecular species named as in [
2-
4].
A key element in Figure is the heat-shock transcription factor-1 HSF, which is transiently activated by elevated temperatures. HSF translocates to the nucleus upon elevated temperatures, forming homotrimeric complexes. The HSF3 homotrimers bind to the heat shock element HSE on the DNA and control the expression of the hsp70 gene. The Hsp70 proteins protect cells from thermal stress. Thermal stress causes the unfolding of proteins, perturbing thus the pathways under their control. By binding to these proteins, Hsp70 allows them to refold and prevents their aggregation.
Some interactions from Figure are common to two networks, and are represented by lines with different colors. Only the binding and unbinding of the HSF to HSP70 and the HSP degradation appear in all three networks.
Although the networks share few common molecular species and have different architecture, they all share a common theme, namely they all have an entrance module that processes the input signal, which is the heat shock temperature as a function of time, T(t).
In [
2] the temperature acts on a kinase module of Goldbeter and Koshland type [
6]. The temperature changes the rate of conversion of the inactive kinase S into its active form
S* Figure . The transition between the inactive and active form is modeled by
where Stot = S + S* is the total concentration of the kinase. The parameters Km, k, Km, p represent the Michaelis-Menten constants whereas Vm, k and Vm, p represent the maximal rates of phosphorylation and dephosphorylation of S*, respectively.
The ratio
Vm, k/
Vm, p controls the switching between inactive and active states of the kinase [
6] and it was chosen in [
2] as the variable that couples the heat shock network with the external environmental input temperature T(t)
In [
3,
4] the temperature does not act through a kinase module, but directly on the proteins as a whole. No distinction was made between different families of proteins from the cells. The proteins were grouped in two classes: misfolded by the heat shock or correctly folded. Following [
1] the fraction of the misfolded proteins, as a function of temperature, used in [
3,
4] is
where, T(t) is in degree Celsius. In [
3], the temperature influences the network through the rate of change of the misfolded proteins (MPROT in Figure
whereas in [
4] the temperature influences both the misfoled and the correctly folded (PROT in Figure ) proteins
A second common theme for all heat shock networks from [
2-
4] is an in depth study of the time variation of the hsp70 promoter activity. In [
3,
4] the promoter activity is controlled by HSF
3:HSE, whereas in [
2] by P:HSF:HSE. All three networks from Figure incorporate the experimental findings of [
7], that the promoter activity starts from a basal level and rapidly increases, once the stress is applied, reaches a maximum level and attenuates slowly back to the basal level. In [
7] the stress was applied for 250 min at 42°
C on HeLa cells. The rapid increase of the promoter activity spans tens of minutes (about 30 min in [
7]) whereas the attenuation period extends over hundreds of minutes (about 200 min in [
7]). This phenomenon, referred as transient activity, is the hallmark of all three network studies [
2-
4]. In [
2,
3] the hsp70 mRNA has a transient activity also, which is regulated by
S* in [
2].
The transient activity is experimentally probed in [
4] with the help of a yellow fluorescent protein YFP as a reporter. The transactivation of the yfp gene is controlled by its own HSE' elements. In [
4] the connection between the reporter rate of accumulation and the HSF
3:HSE' is
The constant
k1 describes the transcription/translation kinetics, [
4], whereas
k2 describes the degradation of the YFP protein. Using the networks from [
2,
3] we write the YFP accumulation rate as a function of the yfp-mRNA
In all cases considered, based on the network models, the rate of change of the reporter protein has a transient activity. We can write thus, in general, that
Following [
2,
3], Transient(t) depends on mRNA and its controlled regulation, as we discussed in (7). Following [
8], we may argue that Transient(t) depends strongly on the translation initiation effects elicited by the heat stress.
The architecture of the heat shock network will change, as more experimental data will accumulate, but the transient activity of the rate of accumulation of the reporter controlled by the hsp70 promoter will still be present. What will change is the dependance of the Transient(t) function on specific molecular species. For example, the transcriptional corepressor CoREST, which may be important to be included in any new heat shock system network architecture, may influence Transient(t) [
9].
The molecular construct we used to measure the response to thermal stress is a Hsp70-GFP fusion gene transfected in Chinese hamster ovary (CHO) cells, as described in [
5]. The time profile of the GFP protein is recorded using flow cytometry measurements, a technique that allows a quantitative measurement of the fluorescence of a large number of cells (10
4).
Considering a similar model as in (8) for our GFP reporter, we get
To obtain a mathematical representation for Transient(t), we will use the following parametrization of the GFP response
This form for response was also used in [
5] to study the response to one heat shock pulse of a definite temperature T and duration D, Figure . This is not the only possible parametrization for the GFP accumulation. Some parametrization may be based on the Michaelis-Menten approach, including a Hill coefficient. However, (10) proved very useful in constructing a stochastic model for the GFP accumulation in [
5] and it is easy to use it to model a transient activity.
The parameters a and b in (10) depend on the input shock. The GFP(t0) is the value of the GFP measured at some time t0 after the end of the shock. From a theoretical point of view, the time t0 can be any time after the end of the shock. Practically though, the time t0 must be chosen so that the interval between the end of the shock and t0 is not very long compared with 24 hours. After the shock, to measure GFP(t), samples were taken for the next 24 hours at a rate of one sample at every two hours. We sampled, simultaneously, 13 different shock conditions, for the next 24 hours. The delivered shocks, for different conditions, did not end at the same time. As a consequence the time t0 for the first sample was different for different conditions. The experimental values for t0 covered a range from 0.5 hours to 1.8 hours.
The expression (10) is valid only after the end of the heat shock and covers a time range of about 24 hours.
With the help of (10) and (9) the function Transient(t), measured after the end of the shock to the next 24 hours, is expressed in the form
From now on the transient activity and Transient(t) defined above will be used interchangeably. This gives a parametrization for the transient activity in terms of two parameters a and b. Based on the discussions related to (6,7), the parameters a and b depend on molecular species that are part of a yet unknown heat shock network.
The temperature dependence of the Transient(t) is therefore a combination of different pathways and feedback loops. A detailed understanding of the entire network responsible for the heat-shock response will make it possible to deduce the temperature dependance of the Transient(t). Because a detailed understanding of the network is missing at present, our approach will be top-down. Namely, our goal is to obtain Transient(t) from experimental data that describe the time course of the GFP response to input thermal stress.
In other words, instead of looking at HSF3:HSE' or mRNA or other molecular species of an incompletely defined heat shock network, as the output signal of the heat shock system, we consider a and b, that describe the transition activity, as the output of the heat shock system. Our procedure does not imply that searching for a mechanistic architecture for the heat shock system is futile. Contrary, we believe that it is worthy of finding the parameters a and b in terms of concrete molecular species. However, this is not the purpose of this study. Out of the two parameters a and b, we will focus on b because it behaves like a time constant in (11) and thus describes the Transient(t) life time as a function of the input temperature.
To find the mathematical model for b as a function of the input temperature, we will be guided by the parallel-cascade system from Figure .
In many applications, nonlinear systems are described using simplified structures, [
10,
11] and reference therein. The structure of Figure is composed of parallel branches, with each branch represented by a series connection of a linear time-invariant system (L) connected with a memoryless nonlinear system (N). The transfer function for the linear system (L) is described by a convolution integral [
12], whereas the nonlinear memoryless system (N) is described by a nonlinear function
f(
t). The output
y(
t) of the LN parallel-cascade system of Figure is
where the summation index
i runs over all branches. For each branch, the function
gi(
τ) rep-resents the linear system and
fi(
τ) the nonlinear system, respectively. The importance of this setting is that a large class of nonlinear systems can be uniformly approximated using a parallel combination of several L and N cascade systems [
10]. The structure of Figure is not the only possible structure. For example, [
13] uses for each branch three systems, linear-nonlinear-linear, so the parallel system is an LNL cascade. As an example of a general theorem that describes the uniform approximation of nonlinear systems using L and N structures, we state, without proof, the theorem of Palm [
13]. A time-invariant, causal, finite-memory system for which small changes in the the system input result in small changes in the system output, can be uniformly approximated by a parallel-cascade system formed with a finite number of branches, each of which contains an LNL cascade, [
11,
13].
For the present study, we will use a single branch with an NLN structure, Figure . We chose the NLN structure because each simple system represents a process that is important for the heat shock response. The first nonlinear system (N) from the NLN structure represents the amplification of the small changes in the environmental temperature. Similarly with (2,4,5), the functional task of the first module N from our cascade model is to processes the heat shock temperature T(t).
The linear system (L) represents the thermotolerance without a recovery period. Thermotolerance is the ability of cells to better tolerate a strong second heat shock once previously exposed to a first moderate shock [
14]. An entire subsection is devoted below to the phenomenon of thermotolerance without a recovery period and its connection to the linear system (L). The last system (N) represents the transition of the cell's response from a mild to a severe shock. We will explain each block and its corresponding function in the next sections.
In general, the identification of a nonlinear system using a parallel-cascade structure does not require its N and L systems to bear resemblance to the physical nature of the modeled system. For our study, however, we ask for resemblance because we view the model structure in connection with a molecular heat shock network.
The kernel functions
f(.) and
g(.) that describe the nonlinear and linear blocks, respectively, are usually represented as polynomial, rational or exponential functions. We will use exponential functions to construct the kernels of the N and L systems. One reason for using exponential functions is that the amplification (the first N system in the NLN casacde) and the thermotolerance without recovery time (the linear system L) become very simple kernel functions, namely one exponential for the first N system and another one for the L system. Another reason is that we found experimentally that small temperature variations largely influences the response output of the heat shock network. Amplification of small variations is conveniently described by exponential functions. We notice also that in (3) the temperature enters through an exponential function, which expresses, as we mentioned, the strong sensitivity of the heat shock system to a 1°
C temperature change. A strong temperature dependence is also present in [
2], where the ratio
Vm, k/
Vm, p changes five orders of magnitudes for a temperature change of 4°
C.
For consistency, the last N system in the NLN cascade will be modeled also using an exponential function, although it must be multiplied by a linear function to express the transition from mild to severe stress response. As a general approach, we also aimed to use as few parameters as possible to construct each of the kernel functions of the NLN cascade.
Below we explain the building of the NLN model out of experimental data, together with the logic we used at each step and the tests we applied to check the modeling process.
We stressed the CHO cells with different heat shocks profiles. For linear systems, and from a theoretical point of view, it will suffice to use a Dirac-δ input. However, we cannot impose too high of a temperature on cells. So, we need to use a set of one pulse shocks, (T, D) of variable temperature T and duration D. The goal is thus to find b(T, D) as a function of T and D.
Because the system is nonlinear, the response to one shock will not reveal its nature. For this reason we extend the set of input shocks to contain also double shocks of different temperature levels and durations. For a double shock input, (T1, D1) followed by (T2, D2), the b will depend on all four variables, b(T1, T2, D1, D2). We will only study the case for which the time between the shocks is negligible, a point that will be elaborated in more detail in connection with the definition of thermotolerance. For all double shocks, the time between shocks was about two minutes which was necessary for replacing the media used for the first shock temperature with the media for the second shock temperature. The replacement included a centrifugal spinning at the end of the first shock, followed by cell resuspension into the second shock media. The shocks from a two shock sequence are thus distinct, with a short time interval in between. Details on shock delivery are described in Material and Methods. The reason that b is seen either as a function of two, (T, D) or four,(T1, T2, D1, D2), variables is because we regard b as a function of the input heat shock, b(Heat Shock). The b(T, D) and b(T1, T2, D1, D2) functions appear as a result of the composition of the function b(Heat Shock) with the function Heat Shock(T, D) and, Heat Shock(T1, T2, D1, D2), respectively.
If we can find the function
C(
x,
y) that describes the composition law
b(
T1,
T2,
D1,
D2) =
C(
b(
T1,
D1),
b(
T2,
D2)), we can find the cell's response to three or more adjacent shocks, under the natural hypothesis that the composition law is associative. We thus limit our input stresses to single and double shocks. In what follows we explain a series of 261 experimental conditions and the theoretical reasonings that led us to propose a model for the transient activity (the b-values) as a function of the input stress. The question about the mathematical model that describes the function
b(
T,
D) was left open in [
5]. The only reference about
b(
T,
D) was Figure in [
5], which is corrected in the present study by removing the saddle point. The improvement is due to an increase in the number and the structure of the shocks from 48 in [
5] to 261 in the present study.