The recurrent network that forms the basis of our studies is a conventional model in which the outputs of individual neurons are characterized by firing rates and neurons are sparsely interconnected through excitatory and inhibitory synapses of various strengths (Methods). Following ideas developed in the context of liquid-state (Maass et al., 2002) and echo-state (Jaeger, 2003) models, we assume that this basic network is not designed for any specific task, but is instead a general purpose dynamical system that will be co-opted for particular applications through subsequent synaptic modification. As a result, the connectivity and synaptic strengths of the network are chosen randomly (Methods). For the parameters we use, the initial state of the network is chaotic ().
Figure 2 FORCE learning in the network of . A-C) The FORCE training sequence. Network output, z, is in red, the firing rates of 10 sample neurons from the network are in blue and the orange trace is the magnitude of the time derivative of the readout …
To specify a task for the networks of , we must define their outputs. In a full model, this would involve simulating activity all the way out to the periphery. In the absence of such a complete model, we need to have a way of describing what the network is “doing”, and here we follow another suggestion from the liquid- and echo-state approach (Maass et al., 2002; Jaeger, 2003; see also Buonomano and Merzenich, 1995). We define the network output through a weighted sum of its activities. Denoting the activities of the network neurons at time t by assembling them into a column vector r(t) and the weights connecting these neurons to the output by another column vector w, we define the network output as
$$z(t) = \mathbf{w}^{\mathsf{T}}\mathbf{r}(t). \tag{1}$$
Multiple readouts can be defined in a similar manner, each with its own set of weights, but we restrict the discussion to one readout at this point. Although a linear readout is a useful way of defining what we mean by the output of a network, it should be kept in mind that it is a computational stand-in for complex transduction circuitry. For this reason, we refer to the output-generating element as a unit rather than a neuron, and we call the components of w weights rather than synaptic strengths.
Having specified the network output, we can now define the task we want the network to perform, which is to set z(t) = f (t) for a pre-defined target function f (t). In most of the examples we present, the goal is to make a network produce the target function in the absence of any input. Later, we consider the more conventional network task of generating outputs that depend on inputs to the network in a specified way. Due to stability issues, this is an easier task than generating target functions without inputs, so we mainly focus on the no-input case.
In the initial instantiation of our model (), we follow Jaeger and Haas (2004) and modify only the output weight vector w. All other network connections are left unchanged from their initial, randomly chosen values. The critical element that makes such a procedure possible is a feedback loop that carries the output z back into the network (). Learning cannot be accomplished in a network receiving no external input without including such a loop. The strengths of the synapses from this loop onto the neurons of the network are chosen randomly and left unmodified. The strength of the feedback synapses is of order 1 whereas that of synapses between neurons of the recurrent network is of order 1 over the square root of the number of recurrent synapses per neuron. The feedback synapses are made stronger so that the feedback pathway has an appreciable effect on the activity of the recurrent network. Later, when we consider the architectures of , we will no longer need such strong synapses.
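The architecture just described can be sketched in a few lines of Python. The Methods are not reproduced in this section, so the sketch assumes the commonly used firing-rate dynamics τ dx/dt = −x + J r + w_fb z with r = tanh(x), integrated with a simple Euler step. The function names `make_network` and `step`, the uniform distribution of the feedback weights, and all parameter values are illustrative assumptions rather than the authors' settings. The point of the sketch is only the scaling: recurrent strengths of order 1 over the square root of the number of synapses per neuron, and feedback strengths of order 1.

```python
import numpy as np

def make_network(n=200, p=0.1, g=1.5, seed=0):
    """Build random sparse recurrent weights J and feedback weights w_fb."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < p                  # sparse random connectivity
    # Recurrent strengths are of order 1/sqrt(synapses per neuron) = 1/sqrt(p*n).
    J = g * mask * rng.standard_normal((n, n)) / np.sqrt(p * n)
    # Feedback strengths are of order 1 (uniform in [-1, 1] is an assumption).
    w_fb = rng.uniform(-1.0, 1.0, size=n)
    return J, w_fb

def step(x, z, J, w_fb, dt=1.0, tau=10.0):
    """One Euler step of the assumed rate dynamics; returns new state and rates."""
    r = np.tanh(x)
    x = x + (dt / tau) * (-x + J @ r + w_fb * z)
    return x, np.tanh(x)

J, w_fb = make_network()
rng = np.random.default_rng(1)
x = 0.5 * rng.standard_normal(J.shape[0])   # random initial state
w = np.zeros(J.shape[0])                    # readout weights before training
for _ in range(100):
    z = w @ np.tanh(x)                      # linear readout, z = w^T r
    x, r = step(x, z, J, w_fb)
```

With the readout weights still at zero, the feedback term vanishes and the loop simply runs the untrained network; training, sketched separately, is what gives the feedback pathway a signal to carry.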
Training in the presence of the feedback loop connecting the output in back to the network is challenging because modifying the readout weights produces delayed effects that can be difficult to calculate. Modifying w has a direct effect on the output z given by equation 1, and it is easy to determine how to change w to make z closer to f through this direct effect. However, the feedback loop in gives rise to a delayed effect when the resulting change in the output caused by modifying w propagates repeatedly along the feedback pathway and through the network, changing network activities. Because of this delayed effect, a weight modification that at first appears to bring z closer to f may later cause it to deviate away. This problem of delayed effects arises when attempting to modify synapses in any recurrent architecture, including those of .
As stated in the Introduction, Jaeger and Haas (2004) eliminated the problem of delayed effects by clamping feedback during learning. In other words, the output of the network, given by equation 1, was compared with f to determine an error that controlled modification of the readout weights, but this output was not fed back to the network during training. Instead, the feedback pathway was clamped to the target function f. The true output was only fed back to the network after training was completed.
We take another approach, which does not require any clamping or manipulation of the feedback pathway; it relies solely on error-based modification of the readout weights. In this scheme, we allow output errors to be fed back into the network, but we keep them small by making rapid and effective weight modifications. As long as output errors are small enough, they can be fed back without disrupting learning, i.e. without introducing significant delayed, reverberating effects. Because the method requires tight control of a small (first-order) error, we call it First-Order, Reduced and Controlled Error or FORCE learning. Although the FORCE procedure holds the feedback signal close to its desired value, it does not completely clamp it. This difference, although numerically small, has extremely significant implications for network stability. Small differences between the actual and desired output of the network during training allow the learning procedure to sample instabilities in the recurrent network and stabilize them.
A learning algorithm suitable for FORCE learning must rapidly reduce the magnitude of the difference between the actual and desired output to a small value, and then keep it small while searching for and eventually finding a set of fixed readout weights that can maintain a small error without further modification. A number of algorithms are capable of doing this (Discussion). All of them involve updates to the values of the weights at times separated by an interval Δt. Each update consists of evaluating the output of the network, determining how far this output deviates from the target function, and modifying the readout weights accordingly. Note that Δt is the interval of time between modifications of the readout weights, not the basic integration time step for the network simulation, which can be smaller than Δt .
At time t, the training procedure starts by sampling the network output, which is given at this point by wT(t − Δt)r(t). The reason that the weights appear here evaluated at time t − Δt is that they have not yet been updated by the modification procedure, so they take the same values that they had at the end of the previous update. Comparing this output with the desired target output f(t), we define the error
$$e_-(t) = \mathbf{w}^{\mathsf{T}}(t-\Delta t)\,\mathbf{r}(t) - f(t). \tag{2}$$
The minus subscript signifies that this is the error prior to the weight update at time t. The next step in the training process is to update the weights from w(t − Δt) to w(t) in a way that reduces the magnitude of e-(t). Immediately after the weight update, the output of the network is wT(t)r(t), assuming that the weights are modified rapidly on the scale of network evolution (Discussion). Thus, the error after the weight update is
$$e_+(t) = \mathbf{w}^{\mathsf{T}}(t)\,\mathbf{r}(t) - f(t), \tag{3}$$
with the plus subscript signifying the error after the weights have been updated.
The goal of any weight modification scheme is to reduce errors by making |e+(t)| < |e-(t)| and also to converge to a solution in which the weight vector is no longer changing so that training can be terminated. This latter condition corresponds to making e+(t)/e-(t) → 1 by the end of training. In most training procedures, these two conditions are accompanied by a steady reduction in the magnitude of both errors (e+ and e-) over time, which are both quite large during the early stages of training. FORCE learning is unusual in that the magnitudes of these errors are small throughout the learning process, although they are similarly reduced over time. This is done by making a large reduction in their size at the time of the first weight update, and then maintaining small errors throughout the training process that decrease with time.
If the training process is initialized at time t = 0, the first weight update will occur at time Δt. A weight modification rule useful for FORCE learning should make |e+(Δt)|, the error after the first weight update has been performed, small, and then keep |e-(t)| small while slowly increasing e+(t)/e-(t) toward 1. Given a small magnitude of e+(t − Δt), e-(t), which is equal to e+(t − Δt) plus a term of order Δt, is kept small by keeping the updating interval Δt sufficiently short. This means that learning can be performed with an error that starts and stays small.
As stated above, several modification rules meet the requirements of FORCE learning, but the recursive least squares (RLS) algorithm is particularly powerful (Haykin, 2002), and we use it here (see Discussion and Supplementary Materials for another, simpler algorithm). In RLS modification,
$$\mathbf{w}(t) = \mathbf{w}(t-\Delta t) - e_-(t)\,P(t)\,\mathbf{r}(t), \tag{4}$$
where P(t) is an N × N matrix that is updated at the same time as the weights according to the rule
$$P(t) = P(t-\Delta t) - \frac{P(t-\Delta t)\,\mathbf{r}(t)\,\mathbf{r}^{\mathsf{T}}(t)\,P(t-\Delta t)}{1 + \mathbf{r}^{\mathsf{T}}(t)\,P(t-\Delta t)\,\mathbf{r}(t)}. \tag{5}$$
The algorithm also requires an initial value for P, which is taken to be
$$P(0) = \frac{I}{\alpha}, \tag{6}$$
where I is the identity matrix and α is a constant parameter. Equation 4 can be viewed as a standard delta-type rule (that is, a rule involving the product of the error and the presynaptic firing rate), but with multiple learning rates given by the matrix P, rather than by a scalar quantity. In this algorithm, P is a running estimate of the inverse of the correlation matrix of the network rates r plus a regularization term (Haykin, 2002).
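A single RLS update can be written compactly in Python. The sketch below assumes the standard RLS form (error times P r for the weights, a rank-one correction for P, and P initialized to the identity divided by α), which is taken here to be the rule the text describes. The function name `rls_step` and the stand-in rates and target value are hypothetical and chosen only for illustration.

```python
import numpy as np

def rls_step(w, P, r, f):
    """One RLS update of readout weights w and matrix P for rates r and target f."""
    e_minus = w @ r - f                          # error before the update
    Pr = P @ r
    # Rank-one update of P; P is symmetric, so P r r^T P = outer(Pr, Pr).
    P = P - np.outer(Pr, Pr) / (1.0 + r @ Pr)
    w = w - e_minus * (P @ r)                    # delta-type rule with matrix rate P
    e_plus = w @ r - f                           # error after the update
    return w, P, e_minus, e_plus

n, alpha = 100, 1.0
rng = np.random.default_rng(0)
w = np.zeros(n)                                  # zero initial readout weights
P = np.eye(n) / alpha                            # initial P = I / alpha
r = np.tanh(rng.standard_normal(n))              # stand-in firing rates
f = 1.0                                          # stand-in target value
w, P, e_minus, e_plus = rls_step(w, P, r, f)
```

Each update costs on the order of N² operations because of the outer product, which is one reason the update interval Δt can reasonably be longer than the integration step.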
It is straightforward to show that the RLS rule satisfies the conditions necessary for FORCE learning. First, if we assume that the initial readout weights are zero for simplicity (this is not essential), the above equations imply that the error after the first weight update is
$$e_+(\Delta t) = -\frac{\alpha f(\Delta t)}{\alpha + \mathbf{r}^{\mathsf{T}}(\Delta t)\,\mathbf{r}(\Delta t)}. \tag{7}$$
The quantity rTr is of order N, the number of neurons in the network, so as long as α ≪ N, this error is small, and its size can be controlled by adjusting α (see below). Furthermore, at subsequent times, the above equations imply that
$$e_+(t) = e_-(t)\left(1 - \mathbf{r}^{\mathsf{T}}(t)\,P(t)\,\mathbf{r}(t)\right). \tag{8}$$
The quantity rTPr varies over the course of learning from something close to 1 to a value that asymptotically approaches 0, and it is always positive. This means that the size of the error is reduced by the weight update, as required, and ultimately e+(t)/e-(t) → 1.
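For readers who want the intermediate steps, the two error relations can be derived as follows. This is a sketch that assumes the standard RLS form, with weights updated by the error times P r, a rank-one update of P, zero initial weights, and P(0) = I/α. Here r stands for the rate vector at the time of the update, and q is a shorthand introduced only for this derivation.

```latex
% Assumptions: standard RLS updates, w(0) = 0, P(0) = I/\alpha, r nonzero.
\begin{align*}
P(t)\,\mathbf{r} &= \frac{P(t-\Delta t)\,\mathbf{r}}{1+q},
  \qquad q = \mathbf{r}^{\mathsf T} P(t-\Delta t)\,\mathbf{r} > 0, \\
\mathbf{r}^{\mathsf T} P(t)\,\mathbf{r} &= \frac{q}{1+q} \in (0,1), \\
e_+(t) &= \bigl[\mathbf{w}(t-\Delta t) - e_-(t)\,P(t)\,\mathbf{r}\bigr]^{\mathsf T}\mathbf{r} - f(t)
  = e_-(t)\bigl(1 - \mathbf{r}^{\mathsf T} P(t)\,\mathbf{r}\bigr), \\
e_+(\Delta t) &= -f(\Delta t)\Bigl(1 - \frac{\mathbf{r}^{\mathsf T}\mathbf{r}}{\alpha + \mathbf{r}^{\mathsf T}\mathbf{r}}\Bigr)
  = -\frac{\alpha\, f(\Delta t)}{\alpha + \mathbf{r}^{\mathsf T}\mathbf{r}} .
\end{align*}
```

The second line shows why the factor multiplying the pre-update error always lies strictly between 0 and 1 as long as P remains positive definite, so each update shrinks the error without reversing its sign.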
The parameter α, which acts as a learning rate, should be adjusted depending on the particular target function being learned. Small α values result in fast learning, but sometimes make weight changes so rapid that the algorithm becomes unstable. In those cases, larger α should be used (subject to the constraint α ≪ N), but if α is too large, the FORCE algorithm may not keep the output close to the target function for a long enough time, causing learning to fail. In practice, values from 1 to 100 are effective, depending on the task.
In addition to dealing with feedback, FORCE learning must control the chaotic activity of the network during the training process. In this regard, it is important to note that the network we are considering is being driven through the feedback pathway by a signal approximately equal to the target function. Such an input can induce a transition between chaotic and non-chaotic states (Molgedey et al., 1992; Bertschinger and Natschläger, 2004; Rajan et al., 2008). This is how the problem of chaotic activity can be avoided. Provided that the feedback signal is of sufficient amplitude and frequency to induce a transition to a non-chaotic state (the required properties are discussed in Rajan et al., 2008), learning can take place in the absence of chaotic activity, even though the network is chaotic prior to learning.
Examples of FORCE Learning
illustrates how the activity of an initially chaotic network can be modified so that it ends up producing a periodic, triangle-wave output autonomously. Initially, with the output weight vector w chosen randomly, the neurons in the network exhibit chaotic spontaneous activity, as does the network output (). When we start FORCE learning, the weights of the readout connections begin to fluctuate rapidly, which immediately changes the activity of the network so that it is periodic rather than chaotic and forces the output to match the target triangle wave (). The progression of learning can be tracked by monitoring the size of the fluctuations in the readout weights (orange trace in ), which diminish over time as the learning procedure establishes a set of static weights that generate the target function without requiring modification. At this point, learning can be turned off, and the network continues to generate the triangle wave output on its own indefinitely (). The learning process is rapid, converging in only four cycles of the triangle wave in this example.
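The training sequence described above can be sketched end to end in Python. This is a minimal illustration under stated assumptions, not the authors' implementation (their Matlab files accompany the Supplementary Materials): it assumes rate dynamics τ dx/dt = −x + J r + w_fb z with r = tanh(x), Euler integration, the standard RLS update, and illustrative parameter values. The names `force_train` and `triangle_wave` are hypothetical.

```python
import numpy as np

def triangle_wave(t, period):
    """Triangle wave with values in [-1, 1] and the given period."""
    phase = (t / period) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0

def force_train(n=300, p=0.1, g=1.5, alpha=1.0, tau=10.0, dt=1.0,
                update_every=2, n_steps=6000, period=600.0, seed=0):
    """Train readout weights with FORCE/RLS; returns weights and |e_minus| trace."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n, n)) < p
    J = g * mask * rng.standard_normal((n, n)) / np.sqrt(p * n)
    w_fb = rng.uniform(-1.0, 1.0, n)        # fixed random feedback weights
    w = np.zeros(n)                         # readout weights (the only ones trained)
    P = np.eye(n) / alpha
    x = 0.5 * rng.standard_normal(n)
    r = np.tanh(x)
    z = w @ r
    errors = []
    for k in range(1, n_steps + 1):
        t = k * dt
        # The network evolves with its own output z fed back (no clamping).
        x += (dt / tau) * (-x + J @ r + w_fb * z)
        r = np.tanh(x)
        z = w @ r
        if k % update_every == 0:           # weight updates every update_every*dt
            e_minus = z - triangle_wave(t, period)
            Pr = P @ r
            P -= np.outer(Pr, Pr) / (1.0 + r @ Pr)
            w -= e_minus * (P @ r)
            z = w @ r                       # output after the weight update
            errors.append(abs(e_minus))
    return w, np.array(errors)

w, errors = force_train()
```

The returned `errors` trace records the pre-update error at each weight update, which is the quantity FORCE learning is designed to keep small from the first update onward. Testing autonomous generation would require continuing the simulation with the RLS branch switched off.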
FORCE learning can be used to modify networks that are initially in a chaotic state so that they autonomously produce a wide variety of outputs (). In these examples, training typically converges in about 1000τ, where τ is the basic time constant of the network, which we set to 10 ms. This means learning takes about 10 s of simulated time. Networks can be trained to produce periodic functions of different complexity and form (), even when the target function is corrupted by noise (). The dynamic range of the outputs that chaotic networks can be trained to generate by FORCE learning is impressive. For example, a 1000 neuron network with a time constant of 10 ms can produce sine wave outputs with periods ranging from 60 ms to 8 s ().
FORCE learning is not restricted to periodic functions. For example, a network can be trained to produce an output matching one of the dynamic variables of the three-dimensional chaotic Lorenz attractor (Methods, see also Jaeger and Haas, 2004), although in this case, because the target is itself a chaotic process, a precise match between output and target can only last for a finite amount of time (). After the two traces diverge, the network still produces a trace that looks like it comes from the Lorenz model.
FORCE learning can also produce a segment matching a one-shot, non-repeating target function (). To produce such a one-shot sequence, the network must be initialized properly, and we do this by introducing a fixed-point attractor as well as the network configuration that produces the one-shot sequence. This is done by adding a second readout unit to the network that also provides feedback (Methods, network diagram in ). The first feedback unit induces the fixed point corresponding to a constant z output (horizontal red line in ), and then the second unit induces the target pattern (red trace between the arrows in ). As shown below, initialization can also be achieved through appropriate input.
As discussed above, FORCE learning must induce a transition in the network from chaotic to non-chaotic activity during training. This requires an input to the network, through the feedback loop in our case, of sufficient amplitude. If we try to train a network to generate a target function with too small an amplitude, the activity of the network neurons remains chaotic even after FORCE learning is activated (). In this case, learning does not converge to a successful solution. There are a number of solutions to this problem. It is possible for the network to generate low amplitude oscillatory and non-oscillatory functions if these are displaced from zero by a constant shift. Alternatively, the networks shown in can be trained to generate low amplitude signals centered near zero.
PCA Analysis of FORCE Learning
The activity of a network that has been modified by the FORCE procedure to produce a particular output can be analyzed by principal component analysis (PCA). For a network producing the periodic pattern shown in , the distribution of PCA eigenvalues () indicates that the trajectory of network activity lies primarily in a subspace that is of considerably lower dimension than the number of network neurons. The projections of the network activity vector r(t) onto the PC vectors form a set of orthogonal basis functions () from which the target function is generated. An accurate approximation of the network output (brown trace in ) can be generated using the basis functions derived from only the first 8 principal components (with components labeled in decreasing order of the size of their eigenvalues). These 8 components are not the whole story, however, because, along with generating the target function, the network must be stable. If we express the readout weight vector in terms of its projections onto the PC vectors of the network activity, we find that learning sets about the top 50 of these projections to uniquely specified values (). The remaining projections take different values from one learning trial to the next, depending on initial conditions (). This multiplicity of solutions greatly simplifies the task of finding successful readout weights.
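The PCA procedure described in this paragraph can be sketched with NumPy. Because no recorded network activity is available here, the sketch uses a synthetic stand-in: 8 sinusoids mixed randomly into 200 units plus a small amount of noise. The data, the 99% variance threshold, and all variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 200, 2000, 8
t = np.arange(T)
# Stand-in for recorded rates r(t): k sinusoids mixed randomly into n units.
latents = np.stack(
    [np.sin(2 * np.pi * (i + 1) * t / 500.0 + i) for i in range(k)], axis=1)
R = latents @ rng.standard_normal((k, n)) + 0.01 * rng.standard_normal((T, n))

Rc = R - R.mean(axis=0)                      # center each unit's activity
C = Rc.T @ Rc / T                            # n x n covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)         # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]            # reorder to decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = np.cumsum(eigvals) / eigvals.sum()
n_components = int(np.searchsorted(explained, 0.99) + 1)
# Projections of activity onto the leading PC vectors: the basis functions.
projections = Rc @ eigvecs[:, :n_components]
```

The columns of `projections` are mutually orthogonal over time, which is the sense in which the projections form an orthogonal set of basis functions. A readout restricted to these leading components would be obtained by regressing the target onto them.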
Figure 3 Principal component analysis of network activity. A) Output after training a network to produce a sum of four sinusoids (red), and the approximation (brown) obtained using activity projected onto the 8 leading principal components. B) Projections of network …
The uneven distribution of eigenvalues shown in illustrates why the RLS algorithm works so well for FORCE learning. As mentioned previously, the matrix P acts as a set of learning rates for the RLS algorithm. This is seen most clearly by shifting to a basis in which P is diagonal. Assuming learning has progressed long enough for P to have converged to the inverse correlation matrix of r, the diagonal basis is achieved by projecting w and r onto the PC vectors. Doing this, it is straightforward to show that the learning rate for the component of w aligned with PC vector a after M weight updates is 1/(Mλa +α), where λa is the corresponding PC eigenvalue. This rate divides the RLS process into two stages, one when M <α/λa in which the major role of weight modification is to control the output (set it close to f), and another when M >α/λa in which the goal is learning, that is, finding a static weight that accomplishes the task. Components of w with large eigenvalues quickly enter the learning phase, whereas those with small eigenvalues spend more time in the control phase (). Controlling components with small eigenvalues allows weight projections in dimensions with large eigenvalues to be learned.
The learning rate for all components during the control phase is 1/α. During the learning phase, the rate for PC component a is proportional to 1/λa. The average rate of change (as opposed to just the learning rate) of the projection of the output weight vector onto principal component a is proportional to 1/√λa, because the factor of r in equation 4 introduces a term proportional to √λa, so the full rate of change for large M is proportional to √λa/λa = 1/√λa. This is exactly what it should be, because in the expression for z, this change is multiplied by the projection of r onto PC vector a, which again has an amplitude proportional to √λa. Thus, RLS, by having rates of change of w proportional to 1/√λa in the PC basis, allows all the projections to, potentially, contribute equally to the output of the network.
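The two-stage behavior of the learning rate 1/(Mλa + α) is easy to tabulate. The short Python sketch below uses made-up eigenvalues and α = 1 purely for illustration; `rls_rate` is a hypothetical helper name.

```python
import numpy as np

def rls_rate(eigval, n_updates, alpha):
    """Effective RLS learning rate along a PC after n_updates weight updates."""
    return 1.0 / (n_updates * eigval + alpha)

alpha = 1.0
eigvals = np.array([10.0, 1.0, 0.1, 0.001])    # illustrative PC eigenvalues
crossover = alpha / eigvals                    # updates spent in the control phase
for lam, m_star in zip(eigvals, crossover):
    early = rls_rate(lam, 0, alpha)            # control phase: rate is 1/alpha
    late = rls_rate(lam, 100 * m_star, alpha)  # learning phase: rate ~ 1/(M*lam)
    print(f"lambda={lam:g}  crossover M={m_star:g}  "
          f"early rate={early:g}  late rate={late:.3g}")
```

Components with small eigenvalues have a large crossover value of M, so they remain in the control phase long after the leading components have switched to learning.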
Comparison of Echo-State and FORCE Feedback
In echo-state learning (Jaeger and Haas, 2004), the feedback signal during training was set equal to the target function f(t). In FORCE learning, the feedback signal is z(t) during training. To compare these two methods, we introduce a mixed feedback signal, setting the feedback equal to γ f(t) + (1 − γ) z(t) during training. Thus, γ = 0 corresponds to FORCE learning and γ = 1 to echo-state learning, with intermediate values interpolating between these two approaches.
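In a simulation, the mixed signal amounts to a one-line change in what is sent down the feedback pathway during training. A minimal Python sketch follows; the function name `mixed_feedback` and the range check are illustrative assumptions.

```python
def mixed_feedback(z, f, gamma):
    """Feedback used during training: gamma * f + (1 - gamma) * z.

    gamma = 0 feeds back the actual output z (FORCE);
    gamma = 1 feeds back the target f (echo-state clamping).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * f + (1.0 - gamma) * z
```

After training, the feedback would revert to the true output z regardless of the γ used during training, since the target is only available as a teaching signal.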
Training to produce the output of , we find the network is only stable on the majority of trials when γ < 0.15, in other words close to the FORCE limit (). Furthermore, in this γ range, the error in the output after training increases as a function of γ, meaning γ = 0 performs best (). For a typical instability of pure echo-state learning, the output matches the target briefly after learning is terminated, but then it deviates away (). Because this stability problem arises from the failure of the network to sample feedback fluctuations, it can be alleviated somewhat by introducing noise into the feedback loop during training (Jaeger and Haas, 2004, introduced noise into the network, which is less effective). Doing this, we find that pure echo-state learning converges on about 50% of the trials, but the error on these is significantly larger than for pure FORCE learning.
Figure 4 Comparison of different mixtures of FORCE (γ = 0) and echo-state (γ = 1) feedback. A) Percent of trials resulting in stable generation of the target function. B) Mean absolute error (MAE) between the output and target function after learning …
Advantages of Chaotic Spontaneous Activity
To study the effect of spontaneous chaotic activity on network performance, we introduce a factor g that scales the strengths of the recurrent connections within the network. Networks with g < 1 are inactive prior to training, whereas networks with g > 1 exhibit chaotic spontaneous activity (Sompolinsky et al., 1988) that gets more irregular and fluctuates more rapidly as g is increased beyond 1 (we typically use g
The number of cycles required to train a network to generate the periodic target function shown in drops dramatically as a function of g , continuing to fall as g gets larger than 1 (). The average root-mean-square (RMS) error, indicating the difference between the target function and the output of the network after FORCE learning, also decreases with g (). Another measure of training success is the magnitude of the readout weight vector |w| (). Large values of |w| indicate that the solution found by a learning process involves cancellations between large positive and negative contributions. Such solutions tend to be unstable and sensitive to noise. The magnitude of the weight vector falls as a function of g and takes its smallest values in the region g > 1 characterized by chaotic spontaneous activity.
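The role of g as a gain on the recurrent connections can be illustrated numerically. For a large random matrix whose entries have variance g²/N, the eigenvalues fill a disk of radius approximately g, so g = 1 marks the point at which the silent state of the network loses stability. The sketch below assumes sparse Gaussian weights scaled by 1 over the square root of the number of synapses per neuron; the function name `recurrent_weights` and the values of N, p, and g are illustrative.

```python
import numpy as np

def recurrent_weights(n, p, g, rng):
    """Sparse random weights whose entries have variance g**2 / n overall."""
    mask = rng.random((n, n)) < p
    return g * mask * rng.standard_normal((n, n)) / np.sqrt(p * n)

rng = np.random.default_rng(0)
# Largest eigenvalue modulus (spectral radius) for several gains g.
radii = {
    g: np.abs(np.linalg.eigvals(recurrent_weights(400, 0.1, g, rng))).max()
    for g in (0.75, 1.0, 1.5)
}
```

For finite N the spectral radius only approximates g, so the transition at g = 1 is sharp only in the large-network limit.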
Figure 5 Chaos improves training performance. Networks with different g values (Methods) were trained to produce the output of . Results are plotted against g in the range 0.75 < g < 1.56, where learning converged. A) Number of cycles of …
These results indicate that networks that are initially in a chaotic state are quicker to train and produce more accurate and robust outputs than non-chaotic networks. Learning works best when g > 1 and, in fact, fails in this example for networks with g < 0.75. This might suggest that the larger the g value the better, but there is an upper limit. Recall that FORCE learning does not work if the feedback from the readout unit to the network fails to suppress the chaos in the network. For any given target function and set of feedback synaptic strengths, there is an upper limit for g beyond which chaos cannot be suppressed by FORCE learning. Indeed, the range of g values in terminates at g = 1.56 because learning did not converge for higher g values due to this problem. Thus, the best value of g for a particular target function is at the “edge of chaos” (Bertschinger and Natschläger, 2004), that is, the g value just below the point where FORCE learning fails to suppress chaotic activity during training.
Distorted and Delayed Feedback
The linear readout unit was introduced into the network model as a stand-in for a more complex, un-modeled peripheral system, in order to define the output of the network. The critical information provided by the readout unit is the error signal needed to guide weight modification, so its biological interpretation should be as a system that computes or estimates the deviation between an action generated by a network and the desired action. However, in the network configuration presented to this point (), the readout unit, in addition to generating the error signal that guides learning, is also the source of feedback. Given that the output in a biological system is actually the result of a large amount of nonlinear processing and that feedback, whether proprioceptive or a motor efference copy, may have to travel a significant distance before returning to the network, we begin this section by examining the effect of introducing delays and nonlinear distortions along the feedback pathway from the readout unit to the network neurons.
The FORCE learning scheme is remarkably robust to distortions introduced along the feedback pathway (). Nonlinear distortions of the feedback signal can be introduced as long as they do not diminish the temporal fluctuations of the output to the point where chaos cannot be suppressed. We have also introduced low-pass filtering into the feedback pathway, which can be quite extreme before the network fails to learn. Delays can be more problematic if they are too long. The critical point is that FORCE learning works as long as the feedback is of an appropriate form to suppress the initial chaos in the network. This means that the feedback really only has to match the period or the duration of the target function and roughly have the same frequency content.
Figure 6 Feedback variants. A) Network trained to produce a periodic output (red trace) when its feedback (cyan trace) is 1.3 tanh(sin(π z(t − 100 ms))), a delayed and distorted function of the output z(t) (gray oval in circuit diagram). B) FORCE learning …
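The delayed and distorted feedback named in the caption of panel A can be implemented with a fixed-length buffer. The Python sketch below assumes a 1 ms simulation step and a buffer initialized to zero; the class name `DelayedDistortedFeedback` and its parameters are hypothetical.

```python
import math
from collections import deque

class DelayedDistortedFeedback:
    """Feedback 1.3 * tanh(sin(pi * z(t - delay))) using a fixed-length buffer."""

    def __init__(self, delay_ms=100.0, dt_ms=1.0, initial=0.0):
        steps = int(round(delay_ms / dt_ms))
        if steps < 1:
            raise ValueError("delay must be at least one time step")
        # The buffer holds the last `steps` outputs, oldest first.
        self.buffer = deque([initial] * steps, maxlen=steps)

    def __call__(self, z_now):
        z_delayed = self.buffer[0]          # output from `delay_ms` ago
        self.buffer.append(z_now)           # maxlen discards the oldest entry
        return 1.3 * math.tanh(math.sin(math.pi * z_delayed))

fb = DelayedDistortedFeedback()
outputs = [fb(0.5) for _ in range(101)]     # constant output, for illustration
```

In a training loop, the value returned by `fb(z)` would replace z in the feedback term while the error driving the weight updates is still computed from z itself.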
FORCE Learning with Other Network Architectures
Even allowing for distortion and delay, the feedback pathway, originating as it does from the linear readout unit, is a non-biological element of the network architecture of . To address this problem, we consider two ways of separating the feedback pathway from the linear readout of the network and modeling it more realistically. The first is to provide feedback to the network through a second neural network () rather than via the readout unit. To distinguish the two networks, we call the original network, present in , the generator network and this new network the feedback network. The feedback network has nonlinear, dynamic neurons identical to those of the generator network, and is recurrently connected. Each unit of the feedback network produces a distinct output that is fed back to a subset of neurons in the generator network, so the task of carrying feedback is shared across multiple neurons. This repairs two biologically implausible aspects of the architecture of : the strong feedback synapses mentioned above and the fact that every neuron in the network receives the same feedback signal.
When we include a feedback network (), FORCE learning takes place both on the weights connecting the generator network to the readout unit (as in the architecture of ) and on the synapses connecting the generator network to the feedback network (red connections in ). Separating feedback from output introduces a credit-assignment problem because changes to the synapses connecting the generator network to the feedback network do not have a direct effect on the output. To solve this problem within the FORCE learning scheme, we treat every neuron subject to synaptic modification as if it were the readout unit, even when it is not. In other words, we apply equations 4 to every synapse connecting the generator network to the feedback network (Supplementary Materials), and we also apply them to the weights driving the readout unit. When we modify synapses onto a particular neuron of the feedback network, the vector r in these equations is composed of the firing rates of generator network neurons presynaptic to that feedback neuron, and the weight vector w is replaced by the strengths of the synapses it receives from these presynaptic neurons. However, the same error term that originates from the readout (equation 2) is used in these equations whether they are applied to the weights of the readout unit or synapses onto neurons of the feedback network (Methods). The form of FORCE learning we are using is cell autonomous, so no communication of learning-related information between neurons is required to implement these modifications, except that they all use a global error signal.
FORCE learning with a feedback network and independent readout unit can generate complex outputs similar to those in , although parameters such as α may require more careful adjustment. After training, when the output of these networks matches the target function, the activities of neurons in the feedback network do not, despite the fact that their synapses are modified by the same algorithm as the readout weights (). This difference is due to the fact that the feedback network neurons receive input from each other as well as from the generator network, and these other inputs are not modified by the FORCE procedure. Differences between the activity of feedback network neurons and the output of the readout unit can also arise from different values of the synapses and the readout weights prior to learning.
With a separate feedback network, the feedback to an individual neuron of the generator network is a random combination of the activities of a subset of feedback neurons, summed through random synaptic weights. While these sums bear a certain resemblance to the target function, they are not identical to it nor are they identical for different neurons of the generator network. Nevertheless, FORCE learning works. This extends the result of , showing not only that the feedback does not have to be identical to the network output, but that it does not even have to be identical for each neuron of the generator network.
Why does this form of learning, in which every neuron with synapses being modified is treated as if it were producing the output, work? In the example of , the connections from the generator network to the readout unit and to the feedback network are sparse and random (Methods), so that neurons in the feedback network do not receive the same inputs from the generator network as the readout unit. However, suppose for a moment that each neuron of the feedback network, as well as the readout unit, received synapses from all of the neurons of the generator network. In this case, the changes to the synapses onto the feedback neurons would be identical to the changes of the weights onto the readout unit and therefore would induce a signal identical to the output into each neuron of the feedback network. This occurs, even though there is no direct connection between these two circuit elements, because the same learning rule with the same global error is being applied in both cases.
The explanation of why FORCE learning works in the feedback network when the connections from the generator network are sparse rather than all-to-all (as in ) relies on the accuracy of randomly sampling a large system (Sussillo, 2009). With sparse connectivity, each neuron samples a subset of the activities within the full generator network, but if this sample is large enough, it can provide an accurate representation of the leading principal components of the activity of the generator network that drive learning. This is enough information to allow learning to proceed. For , we used an extremely sparse connectivity (Methods) to illustrate that FORCE learning can work even when the connections of the units being modified are highly non-overlapping.
The original generator network we used () is recurrent and can produce its own feedback. This means that we should be able to apply FORCE learning to the synapses of the generator network itself, in the arrangement shown in . To implement FORCE learning within the generator network (Supplementary Materials), we modify every synapse in that network using equations 4. To apply these equations, the vector w is replaced by the set of synapses onto a particular neuron being modified, and r is replaced by the vector formed from the firing rates of all the neurons presynaptic to that network neuron. As in the example of learning in the feedback network, FORCE learning is also applied to the readout weights, and the same error, given by equation 2, is used for every synapse or weight being modified.
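The per-neuron procedure described above can be sketched in code. This is a minimal illustration, not the authors' Matlab implementation. It assumes that equations 4 take the usual recursive-least-squares form (a running matrix P and a weight change proportional to the error times P r), and that the error of equation 2 is the readout output minus the target, evaluated before the update. The function names, the pure-Python list representation, and the constant alpha are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def scaled_identity(n, alpha=1.0):
    """Initial P = I / alpha (alpha is an assumed regularization constant)."""
    return [[(1.0 / alpha if i == j else 0.0) for j in range(n)] for i in range(n)]


def rls_step(w, P, r, error):
    """One assumed RLS-style update for a single unit.

    w     : weights (or synapses) onto the unit being modified
    P     : symmetric running matrix associated with that unit
    r     : firing rates of the neurons presynaptic to that unit
    error : the shared global error (readout output minus target)
    """
    n = len(r)
    Pr = [dot(P[i], r) for i in range(n)]
    denom = 1.0 + dot(r, Pr)
    # P <- P - (P r)(P r)^T / (1 + r^T P r)
    P_new = [[P[i][j] - Pr[i] * Pr[j] / denom for j in range(n)] for i in range(n)]
    # w <- w - error * P_new r, where P_new r simplifies to (P r) / denom
    w_new = [w[i] - error * Pr[i] / denom for i in range(n)]
    return w_new, P_new


def force_step_internal(readout_w, readout_P, synapses, syn_P, presyn, rates, target):
    """Update the readout weights and every modified neuron with the same error.

    synapses[i] and syn_P[i] belong to network neuron i, whose presynaptic
    partners are listed (as indices into rates) in presyn[i].
    """
    error = dot(readout_w, rates) - target  # error before any update
    new_w, new_P = rls_step(readout_w, readout_P, rates, error)
    new_syn, new_syn_P = [], []
    for w_i, P_i, idx in zip(synapses, syn_P, presyn):
        r_i = [rates[j] for j in idx]       # rates presynaptic to neuron i
        w_upd, P_upd = rls_step(w_i, P_i, r_i, error)
        new_syn.append(w_upd)
        new_syn_P.append(P_upd)
    return new_w, new_P, new_syn, new_syn_P, error
```

In this sketch each neuron keeps its own P because each sees a different set of presynaptic rates, while the scalar error is computed once and shared, which is the arrangement the text describes.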
FORCE learning within the network can produce a complex target output (). An argument similar to that given for learning within the feedback network can be applied to FORCE learning for synapses within the generator network. To illustrate how FORCE learning works, we express the total current into each neuron of the generator network as the sum of two terms. One is the current that the original synaptic strengths, prior to learning, produce in neuron i. The other is the extra current generated by the learning-induced changes in these synapses. The first term, as well as the total current, is different for each neuron of the generator network because of the random initial values of the synaptic strengths. The second, learning-induced current, however, is virtually identical to the target function for each neuron of the network (, cyan). Thus, FORCE learning induces a signal representing the target function into the network, just as it does for the architecture of , but in a subtler and more biologically realistic manner.
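In symbols, the two-term decomposition described above can be sketched as follows. The notation is an assumption made here for illustration: $J_{ij}(t)$ denotes the strength at time $t$ of the synapse from neuron $j$ onto neuron $i$, $J_{ij}(0)$ is its value before learning, and $r_j(t)$ is the firing rate of neuron $j$. Any overall scaling factors and external inputs are omitted.

```latex
\sum_j J_{ij}(t)\, r_j(t)
  \;=\;
  \underbrace{\sum_j J_{ij}(0)\, r_j(t)}_{\text{original synapses}}
  \;+\;
  \underbrace{\sum_j \bigl( J_{ij}(t) - J_{ij}(0) \bigr)\, r_j(t)}_{\text{learning-induced changes}}
```

Read this way, the observation in the text is that the first term differs from neuron to neuron, whereas the second term is nearly the same function of time for every $i$ and closely follows the target function.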
Output patterns like those in can be reproduced by FORCE learning applied within the generator or feedback networks. In the following sections, we illustrate the capacity of these forms of FORCE learning while, at the same time, introducing new tasks. All of the examples shown can be reproduced using all three of the architectures in , but for compactness we show results from learning in the generator network in and learning in the feedback network in . For the interested reader, Matlab files that implement FORCE learning in the different architectures are included with the Supplementary Materials.
Figure 7 Multiple pattern generation and 4-Bit memory through learning in the generator network. A) Network with control inputs used to produce multiple output patterns (synapses and readout weights that are modifiable in red). B) Five outputs (1 cycle of each ...
Figure 8 Networks that generate both running and walking human motions. A) Either of these two network architectures can be used to generate the running and walking motions (modifiable readout weights shown in red), but the upper network is shown. Constant inputs ...
Switching Between Multiple Outputs and Input-Output Mapping with Memory
The examples to this point have involved a single target function. We can train networks with the architecture of in both sparse and fully connected configurations (we illustrate the sparse case) to produce multiple functions, with a set of inputs controlling which is generated at any particular time. We do this by introducing static control inputs to the network neurons () and pairing each desired output function with a particular input pattern (Methods). The constant values of the control inputs are chosen randomly. When a particular target function is being either trained or generated, the control inputs to the network are set to the corresponding static pattern and held constant until a different output is desired. The control inputs do not supply any temporal information to the network; they act solely as a switching signal to select a particular output function. The result is a single network that can produce a number of different outputs depending on the values of the control inputs ().
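The pairing of output functions with static control patterns can be illustrated with a short sketch. It is not taken from the paper's Methods: the uniform range of the random values, the function names, and the segment format are assumptions chosen only to show that each pattern is drawn once and then held constant for as long as its output is wanted.

```python
import random


def make_control_patterns(n_functions, n_inputs, seed=0):
    """Draw one fixed random input pattern per target function.

    The range [-1, 1] is an assumed choice for illustration.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
            for _ in range(n_functions)]


def control_schedule(patterns, segments):
    """Expand (function_index, n_steps) segments into per-time-step inputs.

    Within a segment the control input is constant, so it carries no
    temporal information and acts only as a selector.
    """
    schedule = []
    for index, n_steps in segments:
        schedule.extend([patterns[index]] * n_steps)
    return schedule


# Example: hold the pattern for function 0 for 4 steps, then switch to function 2.
patterns = make_control_patterns(n_functions=3, n_inputs=5, seed=1)
schedule = control_schedule(patterns, [(0, 4), (2, 3)])
```

The same schedule would be used during training and during generation, with the target function switched at the same moments as the control pattern.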
Up to now, we have treated the network we are studying as a source of what are analogous to motor output patterns. Networks can also generate complex input/output maps when inputs are present. shows a particularly complex example of a network that functions as a 4-bit memory that is robust to input noise. This network has 8 inputs that randomly connect to neurons in the network and are functionally divided into pairs (Methods). The input values are held at zero except for short pulses to positive values that act as ON and OFF commands for the 4 readout units. Input 1 is the ON command for output 1 and input 2 is its OFF command. Similarly, inputs 3 and 4 are the ON and OFF commands for output 2, and so on. Turning on an output means inducing a transition to a state with a fixed positive value of 1, and turning it off means switching it to -1. After FORCE learning, the inputs correctly turn the appropriate outputs on and off with little crosstalk between inputs and inappropriate outputs (). This occurs despite the random connectivity of the network, which means that the inputs do not segregate into different channels. This example requires the network to have, after learning, 16 different fixed-point attractors, one for each of the 2^4 possible combinations of the 4 outputs, and the correct transitions between these attractors induced by pulsing the 8 inputs.
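The input-output map that the 4-bit memory is asked to learn can be written down as an idealized state machine. The sketch below describes only the target behavior, not the network or its training. The function name and the choice to start with every output off are assumptions.

```python
from itertools import product


def memory_targets(pulses, n_bits=4, initial=-1):
    """Return the target readout values after each input pulse.

    pulses : sequence of input numbers 1..2*n_bits. Odd-numbered inputs
             are ON commands and even-numbered inputs are OFF commands
             for output (k + 1) // 2, following the pairing in the text.
    """
    state = [initial] * n_bits
    history = []
    for k in pulses:
        bit = (k - 1) // 2                    # inputs 1,2 -> output 1; 3,4 -> output 2; ...
        state[bit] = 1 if k % 2 == 1 else -1  # ON sets +1, OFF sets -1
        history.append(tuple(state))
    return history


# Every combination of the 4 outputs is a state the trained network must hold.
all_states = list(product((-1, 1), repeat=4))

# Example pulse sequence: ON 1, OFF 2, ON 2, OFF 1.
targets = memory_targets([1, 4, 3, 2])
```

Here `all_states` enumerates the 2^4 = 16 output combinations, each of which corresponds to one of the fixed-point attractors mentioned in the text, and `memory_targets` gives the transitions between them that the pulsed inputs should induce.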
A Motion Capture Example
Finally, we consider an example of running and walking based on data obtained from human subjects performing these actions while wearing a suit that allows variables such as joint angles to be measured (also studied by Taylor et al., 2008 using a different type of network and learning procedure). These data, from the CMU Motion Capture Library, consist of 95 joint angles measured over hundreds of time steps.
We implemented this example using all the architectures in in both sparse and fully connected configurations with similar results (we show a sparse example using the architecture of ). Producing all 95 joint angle sequences in the data sets requires that these networks have 95 readout units. For internal learning, subsets of neurons subjected to learning were assigned to each readout unit and trained using the error generated by that unit (Methods). Although running and walking might appear to be periodic motions, in fact the joint angles in the real data are non-periodic. For this reason, we introduced static control inputs to initialize the network prior to starting the running or walking motion. Because we wanted a single network to generate both motions, we also used the control inputs to switch between them, as in . The successfully trained networks produced both motions (; for an animated demo showing all the architectures of see the avi files included with the Supplementary Materials) demonstrating that a single chaotic recurrent network can generate multiple, high-dimensional, non-periodic patterns that resemble complex human motions.