The successful development of neural prostheses requires an understanding of the neurobiological bases of cognitive processes, i.e., how the collective activity of populations of neurons results in a higher level process not predictable based on knowledge of the individual neurons and/or synapses alone. We have been studying and applying novel methods for representing nonlinear transformations of multiple spike train inputs (multiple time series of pulse train inputs) produced by synaptic and field interactions among multiple subclasses of neurons arrayed in multiple layers of incompletely connected units. We have been applying our methods to the study of the hippocampus, a cortical brain structure that has been demonstrated, in humans and in animals, to perform the cognitive function of encoding new long-term (declarative) memories. Without their hippocampi, animals and humans retain a short-term memory (memory lasting approximately 1 min), and long-term memory for information learned prior to loss of hippocampal function. Results of more than 20 years of studies have demonstrated that both individual hippocampal neurons, and populations of hippocampal cells, e.g., the neurons comprising one of the three principal subsystems of the hippocampus, induce strong, higher order, nonlinear transformations of hippocampal inputs into hippocampal outputs. For one synaptic input or for a population of synchronously active synaptic inputs, such a transformation is represented by a sequence of action potential inputs being changed into a different sequence of action potential outputs. In other words, an incoming temporal pattern is transformed into a different, outgoing temporal pattern. For multiple, asynchronous synaptic inputs, such a transformation is represented by a spatiotemporal pattern of action potential inputs being changed into a different spatiotemporal pattern of action potential outputs.
Our primary thesis is that the encoding of short-term memories into new, long-term memories represents the collective set of nonlinearities induced by the three or four principal subsystems of the hippocampus, i.e., entorhinal cortex-to-dentate gyrus, dentate gyrus-to-CA3 pyramidal cell region, CA3-to-CA1 pyramidal cell region, and CA1-to-subicular cortex. This hypothesis will be supported by studies using in vivo hippocampal multineuron recordings from animals performing memory tasks that require hippocampal function. The implications of this hypothesis will be discussed in the context of “cognitive prostheses”—neural prostheses for cortical brain regions believed to support cognitive functions, and that often are subject to damage due to stroke, epilepsy, dementia, and closed head trauma.
Cognitive functions such as language, abstract reasoning, and learning and memory have long been held to represent the most complex operations of the brain. Thus, it is not surprising that cognitive functions also have been the most difficult of brain operations to define in terms of underlying neural function and neural mechanisms. Cognition most often is defined in terms of theoretical constructs, for example, “information” or “recognition,” and operations on those constructs, such as “information processing.” Theoretical approaches to cognition, although often successful at the level of inferred cognitive operations and behavior, have difficulty in bridging the gap to neuronal functions (e.g., postsynaptic potentials, or PSPs; action potentials, or APs or “spikes”) and especially in bridging the gap to mechanisms underlying neuronal function (e.g., presynaptic calcium channel kinetics and neurotransmitter release, receptor-channel kinetics, membrane biophysics, synaptic plasticity, etc.). Without common points of registry for the conceptual hierarchies of a neurobiological framework and any theoretical framework for cognition, it becomes difficult, if not impossible, to understand a cognitive process in terms of a corresponding neurobiological process, and vice versa. Although fMRI and other imaging methods hold promise for contributing to the solution of this problem, neither the spatial-temporal resolution nor the generalizability of these technologies is yet at a level to provide the bridge required.
We propose an operational definition of the neurobiological basis of cognition using a combined experimental/theoretical approach designed to measure cognitive processes directly, and to describe them mathematically. Our approach is based on principles of nonlinear systems identification, first developed in the field of engineering. We and our colleagues have spent much of the last 30 years adapting these principles to neurobiological systems, and specifically to the hippocampus, a brain region responsible for long-term memory formation. In our approach, each neuron is considered the fundamental operating unit of a given neural system, consistent with the “neuron doctrine” of Ramon y Cajal in the early part of the 20th century. Neurons generate output signals in the form of all-or-none APs that propagate to other neurons (typically tens to hundreds of other neurons) along “axons” that end in specialized contacts known as “synapses” (Fig. 1). Each AP input (a neuron may receive hundreds to thousands of such inputs) generates a synaptic response that can be depolarizing (excitatory postsynaptic current, EPSC, or potential, EPSP) or hyperpolarizing (inhibitory postsynaptic current, IPSC, or potential, IPSP). If inputs to a neuron cause the resting membrane potential (typically −70 mV relative to the extracellular fluid) to depolarize to −55 mV or more, a “threshold” is crossed which results in the generation of an output AP (this number for threshold varies considerably from neuron to neuron, and should be considered very “approximate”).
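As a rough numerical illustration of this integrate-to-threshold behavior, the sketch below sums exponentially decaying EPSPs and flags threshold crossings. The EPSP amplitude, decay constant, and threshold are hypothetical round numbers consistent with the approximate values quoted above, not measurements.

```python
import numpy as np

def simulate_threshold_neuron(input_times_ms, epsp_peak_mv=4.0, tau_ms=10.0,
                              v_rest=-70.0, v_thresh=-55.0, dt=1.0, t_max=200.0):
    """Toy neuron: each input spike adds an exponentially decaying EPSP to the
    resting potential; an output AP is flagged when the sum crosses threshold."""
    t = np.arange(0.0, t_max, dt)
    v = np.full_like(t, v_rest)
    for t_in in input_times_ms:
        after = t >= t_in
        v[after] += epsp_peak_mv * np.exp(-(t[after] - t_in) / tau_ms)
    # Upward threshold crossings: v reaches threshold this sample, not the last
    crossings = t[(v >= v_thresh) & (np.roll(v, 1) < v_thresh)]
    return v, crossings
```

A single 4-mV EPSP never reaches −55 mV in this sketch, while five synchronous inputs do, which is the essence of summation toward threshold.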
All of these concepts deserve much more detailed consideration for an understanding of the biophysical properties of neurons and/or fundamental principles of synaptic transmission. In this paper, however, we will focus on a few elemental concepts that derive from essential properties of neurons and neural networks, and that are key in determining the theoretical and experimental approach used in our research and described here. We wish to first identify these concepts, and then explain how they have shaped our approach to studying neural function at synaptic, neuron, and network levels of organization. We propose that experimental measurements and mathematical modeling of network function, using the formalisms identified, provide the best available direct observation of high-level neural system function, and thus, a real, definable, and available neural counterpart to “cognitive processes.”
Among these elemental concepts is that of “dynamics,” in other words, the fact that the EPSC and EPSP shown in Fig. 1 do not have a single value amplitude, but instead, have an amplitude that evolves over time. EPSPs reflect EPSCs flowing across the resistance and capacitance of the cell membrane. The amplitude–time course of the EPSCs reflects the probabilistic opening and closing of large numbers of channels in the postsynaptic region of the cell membrane; the channels are activated by neurotransmitter released by vesicles of the presynaptic element, and the binding of that neurotransmitter to the postsynaptic receptor. The dynamics of the relation between receptor binding and channel state are typically described with kinetic models. Because the probability of channel opening is large initially, and then gradually decreases, the EPSC and the subsequent EPSP have the shape that they do: a sharp rise followed by an exponential decay. What also is crucial to an understanding of brain function, however, is that any biological neural network consists of a hierarchical organization of dynamical systems. The dynamics of molecular interactions of receptor and channel subunits determine the dynamics of EPSCs; dynamics of EPSCs determine the dynamics of EPSPs. Within each layer of the hierarchy, elements can be “triggered” or “activated,” but the activity of a given element then evolves largely according to its own internal dynamics.
There also can be interactions between levels in the hierarchy, however. “Internal dynamics” strongly influence the response of a mechanism like an AMPA receptor-channel to an external input; AMPA receptor-channel kinetics are unlikely to change substantially unless there is a genetic mutation of one of its subunits. However, the kinetics of other receptor-channel complexes like the NMDA type, include elements that are voltage-dependent (the voltage-dependent blockade of the NMDA channel by Mg2+ must be relieved by depolarization of the local membrane), and thus, are influenced by a property of the next higher level in the hierarchy, i.e., the neuron. The transmembrane voltages induced by other inputs surrounding any one NMDA receptor-channel are integrated by the postsynaptic neuron to determine the local membrane voltage. This local voltage at the level of the neuron is the source of a feedback to the lower level of synapses, to shape the amplitude–time course of the NMDA-mediated EPSC (see earlier work for a formalism to describe the neural hierarchy).
Another key concept to understanding brain function underlying cognition is the “nonlinearity” of virtually all synaptic and neural mechanisms. What is meant by nonlinearity is straightforward to define, though not so straightforward to measure, and to measure accurately. The definition of nonlinearity, in the context of neural synaptic transmission, is that the response of a postsynaptic neuron to the second of two successive presynaptic stimuli is not predictable by the principle of superposition. Consider the hypothetical examples shown in Fig. 2. The input pulse, when delivered alone (xa), generates an output response (ya) that exhibits a relatively rapid rise and an exponential decay typical of EPSP-like waveforms. In the second case, (xb, yb), two pulses are delivered with an inter-impulse interval such that the second pulse is delivered before the response to the first pulse is completed. This results in postsynaptic responses that are partially overlapping, and notably, the resulting compound EPSP is not equivalent to a simple summation of the two individual EPSPs. Any such deviation from “superposition” is identified as a “nonlinearity.” In this hypothetical example, the resulting response is more than the summation predicted by superposition, i.e., a “facilitative” second-order nonlinearity; a response less than that predicted by superposition is identified as a “suppressive” second-order nonlinearity. Importantly, observable overlap between responses to successive inputs is not required for the generation of nonlinearities. The observable response to a given input may be completed (i.e., the response returns to baseline), but that input event may also have initiated, for example, unobservable activation of biochemical second messenger systems intrinsic to the postsynaptic neuron, and/or excitation of local interneurons that provide feedback to the target cell from which recordings are obtained.
The effects of these secondary inputs may not be observable until expressed in the context of another direct synaptic input (see Fig. 2).
The first two examples (second and third pairs of panels) of nonlinearities considered in Fig. 2 are second-order nonlinearities: deviations from linear summation of responses to single impulses. We also can consider summation of second-order nonlinearities, i.e., summation of two responses where: 1) each response is elicited by the second of a pair of stimulations and 2) the response to at least one pair includes a nonlinearity. This possibility is shown in Fig. 2, the bottom three pairs of panels. Panels xb, yb and xc, yc each demonstrate a significant nonlinearity in response to two different interimpulse intervals. When these two facilitative nonlinearities are combined into a triplet, however, any expected summation of the two facilitations instead is revealed as a strong suppression, as expressed in (xd, yd). We have observed this in our own studies of synaptic transmission in intrinsic hippocampal pathways, though we have yet to conclusively identify the explanation. We hypothesize that the first two pulses of the triplet activate a second messenger system which, in turn, hyperpolarizes the cell membrane, e.g., through activation of a Ca2+-activated K+ conductance.
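The deviation-from-superposition measure just described can be made concrete with a toy facilitative synapse: deliver each pulse alone, deliver the pair, and subtract. The facilitation rule and all time constants below are hypothetical, chosen only to produce a Fig. 2-like facilitative second-order nonlinearity.

```python
import numpy as np

def epsp(t, t0, amp, tau=15.0):
    """Stereotyped EPSP waveform: zero before t0, exponential decay afterward."""
    y = np.zeros_like(t)
    m = t >= t0
    y[m] = amp * np.exp(-(t[m] - t0) / tau)
    return y

def toy_synapse(t, pulse_times, fac_tau=50.0, fac_gain=0.8):
    """Toy facilitative synapse: each pulse's EPSP amplitude is boosted by the
    residual 'facilitation' left by earlier pulses (a hypothetical rule)."""
    y = np.zeros_like(t)
    for i, tp in enumerate(pulse_times):
        residual = sum(np.exp(-(tp - q) / fac_tau) for q in pulse_times[:i])
        y += epsp(t, tp, amp=1.0 + fac_gain * residual)
    return y
```

Subtracting the superposition prediction, y_pair − (y_a + y_b), isolates the second-order nonlinearity; it is identically zero until the second pulse arrives, and positive (facilitative) thereafter for this toy rule.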
Before this discussion proceeds much further, we should pause to provide a mathematical framework useful for defining and quantitatively measuring nonlinearities. To be brief (more complete discussions of the fundamentals of nonlinearities in biological systems are available in the literature), the work of Volterra, Wiener, Marmarelis, and others has established that for any nonlinear, time-invariant (stationary) system with finite memory, the system output y can be represented as a functional power series of the input x, as in the single-input, single-output, discrete-time case

$$y(t) = k_0 + \sum_{\tau=0}^{M} k_1(\tau)\,x(t-\tau) + \sum_{\tau_1=0}^{M}\sum_{\tau_2=0}^{M} k_2(\tau_1,\tau_2)\,x(t-\tau_1)\,x(t-\tau_2) + \sum_{\tau_1=0}^{M}\sum_{\tau_2=0}^{M}\sum_{\tau_3=0}^{M} k_3(\tau_1,\tau_2,\tau_3)\,x(t-\tau_1)\,x(t-\tau_2)\,x(t-\tau_3)$$

where M is the memory length of the system.
In this formulation, the system dynamics are expressed by the temporal convolutions of the input and the Volterra kernel functions k; the system nonlinearity is expressed in the form of multiple convolutions of the input and the higher order (above first order) kernel functions. Kernel functions k thus represent the input–output nonlinear dynamics of the system. The zeroth-order kernel, k0, is the value of the output when the input is absent, i.e., spontaneous activity. The first-order kernel, k1, describes the linear dynamic relation between the input and the output, as a function of the time interval (τ) between the present time and a past time. The second-order kernel, k2, describes the second-order pairwise nonlinear dynamic relation between x and y. The third-order kernel, k3, describes the third-order triplet-wise nonlinear dynamic relation between x and y, and so on. Higher order kernels, e.g., the fourth-order kernel, are not shown in this equation. The formal relation between the Volterra kernels and the single-pulse, paired-pulse, triple-pulse responses shown in Fig. 2 will be described more fully in Section II [30], [31].
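In discrete time, the truncated series is a direct translation of these convolution sums. The sketch below evaluates a single-input series through second order for generic NumPy kernel arrays; it is illustrative code, not the kernel-estimation procedure itself.

```python
import numpy as np

def volterra_predict(x, k0, k1, k2):
    """Evaluate a discrete-time Volterra series truncated at second order:
    y(t) = k0 + sum_tau k1[tau] x(t-tau)
              + sum_tau1 sum_tau2 k2[tau1, tau2] x(t-tau1) x(t-tau2)."""
    M = len(k1)                                # system memory, in samples
    T = len(x)
    y = np.full(T, k0, dtype=float)
    xp = np.concatenate([np.zeros(M - 1), x])  # zero-pad the past
    for t in range(T):
        win = xp[t:t + M][::-1]                # win[tau] = x(t - tau)
        y[t] += k1 @ win                       # first-order convolution
        y[t] += win @ k2 @ win                 # second-order double convolution
    return y
```

Probing such a model with a single impulse returns the first-order kernel itself; contributions that appear only for impulse pairs are the second-order terms, which is the model-side counterpart of the paired-pulse protocols of Fig. 2.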
Cellular mechanisms exhibiting second- and third-order nonlinearities are common throughout the nervous system. It is fair to state that the great majority of mechanisms underlying nervous system functionality exhibit strong second-order nonlinearities, with third and higher order nonlinearities being common rather than rare. Examples of second- and third-order nonlinearities for hippocampal EPSP recordings already have been shown in Fig. 2. Note that the strong facilitation of EPSP amplitude to the second pulse of the triplet almost certainly reflects residual calcium accumulation presynaptically. The first pulse activates voltage-dependent calcium channels located presynaptically; the resulting calcium entry binds with a family of presynaptic molecules to initiate fusion of neurotransmitter-containing vesicles with the presynaptic membrane, and the subsequent release of neurotransmitter into the synaptic cleft. The time course for removal of free calcium from the presynaptic, intracellular space is approximately 50 ms. If a second pulse activates the same presynaptic fibers within that time period, the calcium entry caused by the second pulse will sum with the residual calcium from the first pulse, resulting in a larger amount of neurotransmitter released and thus a larger postsynaptic response.
Note that in Fig. 2, the suppression of the response to the third pulse of the triplet does not represent a “ceiling effect” (or saturation): the response amplitude to the third pulse in the triplet is substantially less than the large amplitude response to the second pulse. Instead, the first two pulses of the triplet initiate intracellular mechanisms and/or feedback circuitry that actively suppress the glutamate-induced depolarization. In intracellular studies conducted previously, it was demonstrated experimentally that the majority of the second-order suppression is induced by GABA-mediated inhibition acting through type A and type B receptor subtypes.
Many of the electrical stimulation protocols that are commonly used to elicit characteristic response profiles from target cells, or to reveal particular currents, provide additional insights into the mechanisms underlying nonlinearities. For example, T-type calcium currents are sometimes studied by slightly hyperpolarizing the neuron cell membrane (to bring the majority of channels out of inactivation) and, while in that hyperpolarized state, delivering a depolarization (approximately 10 mV). These requirements for activating T-type channels would suggest third-order nonlinearities emerging from the requirements of excitatory input delivered following previous excitation; the first excitation must be sufficient to induce GABAergic inhibitory feedback, and the second excitatory barrage must occur within a specific time window to avoid inactivation of the T-type channels. Other calcium channels, e.g., the N-type and the L-type, require different conditions for their activation. Near-selective activation of L-type channels requires a period (e.g., 50 ms) of depolarization (from rest, −75 mV, to approximately 0 mV), followed by an additional depolarization (e.g., to +20 mV). L-type calcium current then will continue to flow provided the depolarization remains, given that L-type calcium channels exhibit little-to-no inactivation. It is difficult to estimate a priori the degree of nonlinearity associated with L-type calcium channel dynamics, but it certainly would be at least of third order, and may extend across two or more orders of nonlinearity. N-type calcium channel dynamics will lie somewhere between those of the T-type and the L-type.
From these examples, we hope that at least some principles have become clear. Namely, kernels, and input–output models in general, provide a different arsenal of measures for looking at the same neurobiological mechanisms examined with other analytical tools used in the neurosciences. Most of the other methods and approaches, which we will term here “mechanistic,” emphasize products of the reductionist approach: analysis and properties of a single mechanism, studied while isolated from the myriad of other mechanisms with which that target mechanism usually interacts. Kernel functions and the class of input–output models discussed here emphasize interactions between mechanisms. Given that cognitive processes must derive from systems-level dynamics, we would argue that input–output modeling is an essential component of any attempt to link cognition to neurobiological mechanisms.
Finally, input–output modeling has sometimes been called a “black box” approach, based on an assumption that practitioners of the approach do not know the box contents, i.e., the neurobiological mechanisms underlying the dynamics being modeled. This assumption is ludicrous on its face, of course. Our input–output modeling of hippocampus, and input–output modeling of other systems like the retina, have been accomplished with the same knowledge of the underlying circuitry, synaptic organization, and pharmacology as studies done on the same systems with mechanistic approaches. In fact, our studies of the role of GABAergic interneurons in second- and third-order nonlinearities of hippocampal dentate granule cells were guided by pharmacologically induced changes in intracellularly recorded membrane potentials. Changes in the kernel functions occurred in response to interstimulus intervals and pairs of interstimulus intervals matching the time constants of GABAA and GABAB receptor kinetics, with drug-induced changes being specific for GABAergic receptor agonists and antagonists—in other words, the input–output studies used techniques, procedures, and criteria nearly identical to those of mechanistic analyses. In the end, however, kernel analyses reveal more about the total system functionality, both because of the effects of broad-band input stimulation (activates many more mechanisms than traditional single-pulse, paired-pulse, or constant frequency stimulation), and because the formalism itself forces data interpretation and problem specification in terms of a neural network or neural systems level of analysis. For example, Fig.
3(a) shows a box diagram of the dentate gyrus in the intact rat, making explicit the relation between dentate granule cells (the principal neurons of the dentate), and many of the known pathways providing feedforward and feedback regulation of granule cells in response to excitation of input from the entorhinal cortex (similar feedback pathways for CA3 and CA1 are not shown). In the context of a continuous (average interimpulse interval: 500 ms), random interval (interval range: 1–5000 ms) impulse train stimulation of excitatory entorhinal input, it can be seen that granule cells will be activated monosynaptically, but also will be stimulated multisynaptically through the commissural, GABAergic, and other feedforward and feedback pathways intrinsic to the dentate. An equivalent representation for the pathways included in the hippocampal slice is shown in Fig. 3(b); the system can be reduced further with pharmacological blockade of GABAergic receptors. In this manner, the underlying anatomical pathways that contribute to dentate granule cell nonlinear dynamics can be readily identified, and used for interpretation of associated input–output models. Thus, input–output modeling is only as “black box,” or uninterpretable, as the user. Recently, the relation between input–output models and mechanistic models has been formalized, and we have shown how both approaches can be used in a complementary manner.
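A stimulation train of the kind described (mean interimpulse interval 500 ms, intervals confined to 1–5000 ms) can be sketched as follows; the choice of an exponential interval distribution with rejection of out-of-range draws is an assumption for illustration, not the authors' stated generator.

```python
import numpy as np

def random_interval_train(n_pulses, mean_ms=500.0, lo_ms=1.0, hi_ms=5000.0, seed=0):
    """Pulse times (ms) for a random-interval train: interimpulse intervals are
    drawn from an exponential distribution with the stated 500-ms mean and
    rejected unless they fall within the stated 1-5000 ms range (the exponential
    form and the rejection scheme are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    intervals = []
    while len(intervals) < n_pulses:
        iv = rng.exponential(mean_ms)
        if lo_ms <= iv <= hi_ms:
            intervals.append(iv)
    return np.cumsum(intervals)   # cumulative sums give the pulse times
```

Broad-band trains like this exercise many interval combinations in a single record, which is what allows second- and higher-order kernels to be estimated efficiently.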
The range of different dynamics found in the nervous system, and the magnitude and higher orders of nonlinearity found for those mechanisms studied to date, provide for considerable complexity of temporal pattern encoding. The degree of complexity increases even further when we consider that the dynamics being discussed to this point are not always constant, but instead, can change over time, or are “nonstationary.” The learning and adaptive capabilities of the vertebrate and invertebrate nervous systems are well established. In addition, the last four decades of neuroscience research have seen experimental identification of a wealth of long-term, permanent changes in cellular and synaptic mechanisms that are induced by the learning process. All of this evidence has shown that learning and memory do not involve “out of the ordinary” mechanisms that are reserved only for learning and memory, and that remain hidden and unexploited until environmental circumstances demand their amalgamation and use. In general, the mechanisms involved in learning and memory are the same mechanisms that underlie the biophysics and synaptic transmission of neurons in day-to-day circumstances: learning and memory simply require more of mechanism x or less of mechanism y. Given that the effects of mechanisms x and y are captured by, and contribute to, the kernels under nonlearning conditions, we should expect to see a relatively smooth change in system dynamics during the course of learning, i.e., there should not be a sudden and abrupt incorporation into the system of a radically different set of mechanisms, which would be reflected by a sudden and abrupt change in system nonlinearities. Although not an optimal test of this hypothesis, the above is precisely what we observed with the induction of long-term potentiation (LTP). The induction of LTP was accompanied by a smooth and gradual change in pre-LTP second- and third-order nonlinearities.
With regard to the Volterra kernel expressions introduced earlier, it thus is reasonable to incorporate cellular plasticity and learning and memory, or nonstationarities, simply by having the kernel expressions become a function of time, t, in addition to remaining a function of τ, the time since a prior input pulse:

$$y(t) = k_0(t) + \sum_{\tau=0}^{M} k_1(t,\tau)\,x(t-\tau) + \sum_{\tau_1=0}^{M}\sum_{\tau_2=0}^{M} k_2(t,\tau_1,\tau_2)\,x(t-\tau_1)\,x(t-\tau_2) + \cdots$$
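Numerically, a nonstationary first-order kernel is simply a function of two arguments, k1(t, τ). The sketch below fixes the τ profile and lets its gain drift smoothly with t, loosely mimicking a gradual LTP-like change; the sigmoid ramp and all constants are assumptions for illustration.

```python
import numpy as np

# Nonstationary first-order kernel k1(t, tau): the tau profile is fixed, but
# its gain drifts smoothly with t (the sigmoid ramp is an illustrative choice).
M, T = 20, 1000
tau = np.arange(M)
profile = np.exp(-tau / 5.0)                                      # EPSP-like decay in tau
gain = 1.0 + 0.5 / (1.0 + np.exp(-(np.arange(T) - 500) / 100.0))  # smooth drift in t
k1_t = gain[:, None] * profile[None, :]                           # k1_t[t, tau]

def predict_first_order(x, k1_t):
    """y(t) = sum over tau of k1(t, tau) x(t - tau), with a time-varying kernel."""
    T, M = k1_t.shape
    xp = np.concatenate([np.zeros(M - 1), x])                     # zero-pad the past
    return np.array([k1_t[t] @ xp[t:t + M][::-1] for t in range(T)])
```

An impulse delivered late in the record evokes a larger response than the same impulse delivered early, even though the model remains first order in the input: the nonstationarity lives entirely in the kernel.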
Of course, there are more neurobiological processes than those underlying learning and memory that change as a function of time, and thus, would be reflected by nonstationarity of neural system kernels. Both the noradrenergic and the serotonergic neurotransmitter systems provide a widely dispersed input to much of the forebrain, thalamus, brainstem, and spinal cord. Both of these systems also change their levels of activity markedly during the sleep–wake cycle, with experiments demonstrating that the actions of norepinephrine and serotonin can significantly alter the responsiveness of recipient neurons to other, nonnoradrenergic and nonserotonergic afferents. For example, we have shown previously that large magnitude changes in noradrenergic levels in hippocampus are associated acutely with substantial changes in second- and third-order nonlinear responsiveness of dentate granule cells to excitatory, glutamatergic input from the perforant path, and inhibitory, GABAergic, input from inhibitory interneurons internal to the dentate gyrus.
Still other processes, with the longest time constants, are likely to be those involved in development of the nervous system. Like changes in kernels representing learning, those representing development will not follow a pattern of deviating from a baseline of system characteristics, and then returning to that standard some predictable period of time later, as should be observed in the case of the dynamics of diurnal cycles. Instead, in the case of development, we would expect nonlinear system characteristics that slowly evolve into progressively richer, more stable, and more distinct (relative to the original) sets of system properties. This also allows for the exciting possibility that abnormal developmental and aging states that are difficult to diagnose (e.g., autism, schizophrenia, Alzheimer’s disease) might be identified and differentiated with the new and varied set of quantifiable descriptors represented by the kernels, and which we propose to be capable of reflecting “system properties” of the neural circuitry underlying cognition.
It has been demonstrated, particularly in cortical systems, that key information guiding trained, intentional behavior is represented in the “ensemble” firing of populations of neurons, i.e., spatiotemporal patterns of electrophysiological activity. The advent of multi-channel single-cell recording has provided the capability for simultaneously observing the firing of tens to hundreds of neurons, so that higher level analyses of the collective relations among subpopulations of neurons can be conducted. This has allowed confirmation of earlier suggestions of collective, ensemble activity in results from single-cell recording studies.
How should this collective activity of subpopulations of neurons be interpreted in terms of cognitive processing? Clearly, when a subpopulation of neurons achieves and maintains a given spatiotemporal pattern, or a given “relatedness in activity,” that as a consequence allows for the identification of a relation between that pattern and an external event, it is reasonable to define that spatiotemporal pattern as a “representation.” Representations are transient because neuron firing typically is maintained in one spatiotemporal pattern for only hundreds to thousands of milliseconds, unless we consider pathological conditions, e.g., the rhythmic, cyclical firing characteristic of epilepsy. The latter and physiological rhythmicities, e.g., alpha rhythm, are indicative of “states” rather than the identities of specific external events.
With regard to hippocampus, such representations, or temporarily stable spatiotemporal patterns, could readily map onto individual memories, possibly even individual components of a memory. As the contents of a memory process, temporarily stable spatiotemporal patterns of activity within areas that provide input to hippocampus could constitute “short-term memories.” With representations as “content,” input–output transformations could be considered “process.” Neural systems and brain regions process information by transforming incoming spatiotemporal patterns into different, outgoing spatiotemporal patterns. This statement is not a claim—there simply is no other reasonable interpretation of the basic phenomenology. Thus, information processing underlying cognition involves transformations of neural representations that are dynamic, nonlinear, and often nonstationary (time-varying). While recent advances in multielectrode technology have made it possible to record the simultaneous activities of populations of neurons in behaving animals, modeling such complex system behavior still remains one of the most challenging tasks in computational neuroscience. It is in response to this need that we have invested over 20 years in the development and refinement of a combined experimental-theoretical strategy for quantitatively characterizing, and then modeling, neural systems typical of those routinely found in the mammalian brain.
We formulate here a three-step strategy to model the cognitive function of brain regions in general, and the hippocampus, in particular. In this strategy, we define the cognitive operation of a brain region as the transformation from its input activities to its output activities. Therefore, understanding the cognitive function of a brain region is equivalent to identifying its input–output transfer function S. Since in brain regions, input–output signals are manifested in the form of spatiotemporal patterns of neural spikes, i.e., all-or-none electrical events recorded from individual neurons, all parameters of the transfer function should be derived from the timings of the input/output spikes. The first two steps deal with the stationary and nonstationary aspects of the transfer function, respectively (Fig. 4 left, middle). For the nonstationary case, the third step seeks to identify the “learning rule” underlying the nonstationarity of the transfer function (Fig. 4 right).
During performance of asymptotically learned behavior, a brain region is modeled as a time-invariant system. Its transformational property is modeled as a stationary process. A time-invariant (stationary) system is one whose transfer function does not depend on time. The modeling goal in this step then is to identify the time-invariant transformation S from the multiple input spike trains X to the multiple output spike trains Y, i.e., Y = S(X). Since the mechanisms underlying synaptic transmission and generation of spikes in neurons are inherently nonlinear and dynamical, the stationary model has to be a multiple-input, multiple-output (MIMO) nonlinear dynamical model.
In our approach, the MIMO model is decomposed into a series of multiple-input, single-output (MISO) models (Fig. 5). Each MISO model is then formulated to have both parametric (i.e., mechanistic) and nonparametric (i.e., descriptive) components. First, the overall model structure is parameterized to be “neuron-like.” It captures the stereotypical features of spiking neurons and explicitly includes variables that can be interpreted as the principal cellular processes such as the postsynaptic potential, the spike-triggered after-potential, the pre-threshold noise, and the spike-generating threshold. This configuration partitions the system nonlinear dynamics in a physiologically realistic manner, and thus facilitates comparison with intracellular recording results. The more versatile features of spiking neurons, i.e., the transformation from the input spikes to postsynaptic potentials and the transformation from the output spikes to the after-potential, on the other hand, are modeled nonparametrically with the Volterra series, taking advantage of its flexibility in capturing nonlinear dynamics.
The MISO model structure consists of five components (Fig. 5): 1) a feedforward block K transforming the input spike trains x to a continuous hidden variable u that can be interpreted as the postsynaptic potential; 2) a feedback block H transforming the preceding output spikes to a continuous hidden variable a that can be interpreted as the after-potential; 3) a noise term ε that captures the system uncertainty caused by both the intrinsic neuronal noise and the unobserved inputs; 4) an adder generating a continuous hidden variable w that can be interpreted as a prethreshold potential; and 5) a threshold function generating output spikes when the value of w crosses θ. The model can be expressed by the following equations:
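The equations referred to here did not survive reproduction; a reconstruction consistent with the five components just described (the exact notation is an assumption) is:

```latex
w(t) = u(t) + a(t) + \varepsilon(t), \qquad
y(t) =
\begin{cases}
0, & w(t) < \theta \\
1, & w(t) \ge \theta
\end{cases}
```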
K takes the form of a Volterra model, in which u is expressed in terms of the inputs x by means of the Volterra series expansion as
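A plausible reconstruction of this expansion, consistent with the kernel definitions explained in the following paragraph (the symbols k1, k2s, k2x and the summation limits are assumptions):

```latex
u(t) = k_0
  + \sum_{n=1}^{N} \sum_{\tau=0}^{M_k} k_1^{(n)}(\tau)\, x_n(t-\tau)
  + \sum_{n=1}^{N} \sum_{\tau_1=0}^{M_k} \sum_{\tau_2=0}^{M_k}
      k_{2s}^{(n)}(\tau_1,\tau_2)\, x_n(t-\tau_1)\, x_n(t-\tau_2)
  + \sum_{n_1=1}^{N} \sum_{n_2=1}^{n_1-1} \sum_{\tau_1=0}^{M_k} \sum_{\tau_2=0}^{M_k}
      k_{2x}^{(n_1,n_2)}(\tau_1,\tau_2)\, x_{n_1}(t-\tau_1)\, x_{n_2}(t-\tau_2)
```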
The zeroth-order kernel, k0, is the value of u when the input is absent, for example, when there are spontaneous variations in membrane potential. First-order kernels, k1(n), describe the linear relation between the nth input xn and u, as functions of the time intervals (τ) between the present time and the past time. In other words, for each of the multiple inputs to the system, first-order kernels account for the effects of a single input event (a spike, or action potential) on the system membrane potential output, u, regardless of when those single input events may have occurred in the past, and thus, regardless of any other inputs that may have occurred between the past time designated by a particular (τ) and the present time. Second-order self-kernels, k2s(n), describe the second-order nonlinear relation between the nth input, xn, and u, as functions of the two time intervals (τ1, τ2) between the present time and the two respective past times. Thus, second-order kernels account for the modulatory effects of an input event occurring in the past on the system membrane potential output, u, evoked by a second input event occurring in the present, when both events occur on the same input. The previous input pulse may increase the response evoked by the present input, i.e., cause facilitation, or may reduce the response evoked by the present input, i.e., cause suppression. Second-order cross-kernels, k2x(n1,n2), describe the second-order nonlinear interactions between each unique pair of inputs (xn1 and xn2) as they affect u, when the two pulse events occur on different inputs. N is the number of inputs. Mk denotes the memory length of the feedforward process. Higher order kernels, e.g., third- and fourth-order kernels, are not shown in this equation, but follow by extension from the explanations above.
Similarly, H takes the form of a first-order Volterra model as in
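A reconstruction of this feedback equation, consistent with the definitions of h and Mh that follow (notation assumed):

```latex
a(t) = \sum_{\tau=1}^{M_h} h(\tau)\, y(t-\tau)
```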
where h is the linear feedback kernel. Mh is the memory length of the feedback process (note that τ starts from 1 instead of 0 to avoid predicting the current output with itself). The noise term ε is modeled as Gaussian white noise with standard deviation σ.
In summary, what the Volterra representation states is that subthreshold variation in membrane potential for any one neuron can be accounted for by variation in the temporal pattern of past action potentials for any one input, or by variation in the spatiotemporal pattern of past action potentials for a population of inputs to that neuron. In total, with all of its components, the model states that, for a population of neurons (input) that provide synaptic input to a second population of neurons (output), variation in the spatiotemporal pattern of past action potentials for the input neurons predicts the spatiotemporal pattern of action potentials for the output population of neurons. We know from what are now tenets of fundamental neuroscience that, in general, such an input–output relation must be true. Outstanding issues relate more to whether or not such a relationship can be quantified or modeled, and whether or not experimental evidence supports such a model to the extent that it can be used to predict the effects of arbitrary input patterns. We report here that both of the latter questions can be answered in the affirmative.
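The model structure just summarized can be sketched in simulation. Below is a minimal, illustrative implementation of a single MISO unit with first-order feedforward kernels only; the kernel shapes, input rates, and parameter values are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_miso(x, k1, h, k0=0.0, sigma=0.1, theta=1.0):
    """Simulate one MISO unit: first-order feedforward kernels k1 (one row
    per input), a feedback kernel h, Gaussian prethreshold noise, and a hard
    threshold. x has shape (n_inputs, T) of 0/1 spikes."""
    n_in, T = x.shape
    Mk, Mh = k1.shape[1], len(h)
    y = np.zeros(T, dtype=int)
    for t in range(T):
        # postsynaptic potential u: convolve past input spikes with k1
        u = k0
        for n in range(n_in):
            for tau in range(min(Mk, t + 1)):
                u += k1[n, tau] * x[n, t - tau]
        # after-potential a: convolve past *output* spikes with h
        a = 0.0
        for tau in range(1, min(Mh + 1, t + 1)):
            a += h[tau - 1] * y[t - tau]
        w = u + a + rng.normal(0.0, sigma)  # prethreshold potential
        y[t] = 1 if w >= theta else 0
    return y

# Illustrative (hypothetical) kernels: two inputs with decaying EPSP-like
# first-order kernels, and a hyperpolarizing after-potential.
k1 = np.array([[0.9 * np.exp(-tau / 3.0) for tau in range(10)],
               [0.6 * np.exp(-tau / 5.0) for tau in range(10)]])
h = -1.5 * np.exp(-np.arange(10) / 4.0)
x = (rng.random((2, 500)) < 0.15).astype(int)   # Poisson-like input trains
y = simulate_miso(x, k1, h)
print(y.sum(), "output spikes in", len(y), "bins")
```

The nested loops make the kernel convolutions explicit at the cost of speed; a practical implementation would vectorize them.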
With the model structure defined as above, the next step is to estimate all model parameters, i.e., the feedforward kernels k, the feedback kernel h, the prethreshold noise standard deviation σ, and the threshold θ, from the timings of the input/output spikes. The biggest challenge in Volterra modeling is the large number of open parameters (coefficients) to be estimated, especially in cases of high-dimensional inputs and high model order. To solve this problem, Laguerre expansion of the Volterra kernels (LEV) and statistical model selection techniques are employed.
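In the Laguerre-expanded form of the model (a reconstruction; the coefficient symbols c1, c2s, c2x, ch and the basis outputs v are assumed from the description that follows), each kernel is replaced by a weighted sum over basis-function outputs:

```latex
u(t) = c_0
  + \sum_{n=1}^{N} \sum_{j=1}^{L} c_1^{(n)}(j)\, v_j^{(n)}(t)
  + \sum_{n=1}^{N} \sum_{j_1=1}^{L} \sum_{j_2=1}^{j_1}
      c_{2s}^{(n)}(j_1,j_2)\, v_{j_1}^{(n)}(t)\, v_{j_2}^{(n)}(t)
  + \sum_{n_1=1}^{N} \sum_{n_2=1}^{n_1-1} \sum_{j_1=1}^{L} \sum_{j_2=1}^{L}
      c_{2x}^{(n_1,n_2)}(j_1,j_2)\, v_{j_1}^{(n_1)}(t)\, v_{j_2}^{(n_2)}(t),
\qquad
a(t) = \sum_{j=1}^{L} c_h(j)\, v_j^{(h)}(t)
```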
where v are the convolutions of the input and output spike trains (x and y) with the Laguerre basis functions b,
and c and ch are the sought Laguerre expansion coefficients of the feedforward kernels k and the feedback kernel h, respectively (c0 is equal to k0). The number of basis functions (L) is typically much smaller than the memory length (Mk and Mh), so the total number of coefficients is greatly reduced.
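As a concrete sketch of the LEV idea, orthonormal discrete-time Laguerre functions can be generated with the standard recursion and convolved with a spike train to obtain the v variables; the values of alpha, L, and M below are illustrative choices, not the authors' settings:

```python
import numpy as np

def laguerre_basis(alpha, L, M):
    """Orthonormal discrete-time Laguerre functions b_j(tau), j = 0..L-1,
    tau = 0..M-1, generated by the standard recursion; alpha in (0, 1)
    controls the decay rate (memory) of the basis."""
    b = np.zeros((L, M))
    sa = np.sqrt(alpha)
    b[0, 0] = np.sqrt(1.0 - alpha)
    for tau in range(1, M):
        b[0, tau] = sa * b[0, tau - 1]
    for j in range(1, L):
        b[j, 0] = sa * b[j - 1, 0]
        for tau in range(1, M):
            b[j, tau] = sa * b[j, tau - 1] + sa * b[j - 1, tau] - b[j - 1, tau - 1]
    return b

# v_j(t): convolve a spike train with each basis function. Two input spikes
# at t = 5 and t = 12 (illustrative).
B = laguerre_basis(alpha=0.5, L=3, M=100)
x = np.zeros(50)
x[5] = x[12] = 1
v = np.array([np.convolve(x, B[j])[:50] for j in range(3)])
print(v.shape)  # → (3, 50)
```

Fitting the model then reduces to estimating L (or L²) coefficients per input instead of Mk (or Mk²) kernel values, which is the dimensionality reduction the text describes.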
All model parameters can be estimated using a maximum-likelihood method. The negative log-likelihood function L is
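A reconstruction of the missing expression (assuming the standard negative log-likelihood form):

```latex
L = -\sum_{t=1}^{T} \log P(t)
```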
where T is the data length, and P is the probability of generating the recorded output y
Since ε is assumed to be Gaussian, the conditional firing probability intensity function Pf (the conditional probability of generating a spike, i.e., Prob(w ≥ θ|x, k, h, σ, θ) in (13)) at time t can be calculated with the Gaussian error function (integral of Gaussian function) erf
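A reconstruction of Pf consistent with the Gaussian noise assumption (notation assumed):

```latex
P_f(t) = \mathrm{Prob}\bigl(w(t) \ge \theta\bigr)
       = \frac{1}{2}\left[1 + \mathrm{erf}\!\left(\frac{u(t) + a(t) - \theta}{\sqrt{2}\,\sigma}\right)\right]
```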
P at time t then can be calculated as
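The Bernoulli form of P(t) (a reconstruction):

```latex
P(t) = P_f(t)^{\,y(t)} \bigl[1 - P_f(t)\bigr]^{\,1 - y(t)}
```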
Model coefficients c then can be estimated by minimizing the negative log-likelihood function L
It has been shown that this model is equivalent to a generalized linear model (GLM) with inputs and preceding output structured as Volterra models. For this reason, this model can be termed a generalized Volterra model (GVM). Note that u, a, and ε are dimensionless variables, so without loss of generality, σ and θ can be set to 1 in estimation, and later restored from the estimated coefficients.
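To illustrate the GLM equivalence, the sketch below fits first-order coefficients by direct minimization of the negative log-likelihood with a probit link (σ = θ = 1). A lagged-spike design matrix stands in for the Laguerre convolutions v, and all data and coefficient values are synthetic:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import erf

rng = np.random.default_rng(1)

def phi(z):
    """Standard-normal CDF via erf: Prob(w >= theta) under Gaussian noise."""
    return 0.5 * (1.0 + erf(z / np.sqrt(2.0)))

# Synthetic single-input data: lagged spikes as features, known coefficients.
T, M = 5000, 5
x = (rng.random(T) < 0.3).astype(float)
V = np.column_stack([np.r_[np.zeros(tau), x[:T - tau]] for tau in range(M)])
c_true = np.array([1.2, 0.8, 0.5, 0.2, 0.1])
pf = phi(V @ c_true - 1.0)          # sigma = theta = 1 by convention
y = (rng.random(T) < pf).astype(float)

def negloglik(c):
    # Bernoulli likelihood of the recorded output under the probit model
    p = np.clip(phi(V @ c - 1.0), 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

c_hat = minimize(negloglik, np.zeros(M), method="BFGS").x
print(np.round(c_hat, 2))
```

With enough data the maximum-likelihood estimate recovers the generating coefficients closely, which is the property that makes the Laguerre/GLM route practical.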
The second step of model estimation involves the selection of optimal subsets of model coefficients. Mathematically, this step is necessary for further reducing the number of model coefficients to avoid overfitting. More importantly, this step identifies the significant inputs (represented by the first- and second-order self-kernels) and the nonlinear interactions between inputs (represented by the second-order cross-kernels) of each output neuron, and results in more interpretable models. For a given output neuron, the selected input neurons are the ones that have functional connections to the output neuron; the selected (second-order) cross-kernels indicate the pairs of inputs that exhibit nonlinear summation in the synaptic potential (u) of the output neuron. The statistical model selection procedure involves a forward stepwise model selection method and a cross-validation method that have been described previously.
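A minimal sketch of forward stepwise selection with cross-validation (a simple held-out split rather than the authors' exact protocol, which is an assumption here): candidate inputs are added one at a time, and an input is retained only if it reduces the out-of-sample negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import erf

rng = np.random.default_rng(2)

def phi(z):
    return 0.5 * (1.0 + erf(z / np.sqrt(2.0)))

def nll(c, V, y):
    p = np.clip(phi(V @ c - 1.0), 1e-9, 1 - 1e-9)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit(V, y):
    return minimize(nll, np.zeros(V.shape[1]), args=(V, y), method="BFGS").x

# Synthetic data: 6 candidate inputs, but only inputs 0 and 3 drive the output.
T = 4000
X = (rng.random((T, 6)) < 0.3).astype(float)
pf = phi(1.5 * X[:, 0] + 1.0 * X[:, 3] - 1.0)
y = (rng.random(T) < pf).astype(float)

idx_tr, idx_te = np.arange(3000), np.arange(3000, T)

def design(cols, idx):
    # intercept column plus the currently selected candidate inputs
    return np.column_stack([np.ones(len(idx))] + [X[idx, j] for j in cols])

selected, remaining = [], list(range(6))
best = nll(fit(design(selected, idx_tr), y[idx_tr]),
           design(selected, idx_te), y[idx_te])
improved = True
while improved and remaining:
    improved = False
    scores = [(nll(fit(design(selected + [j], idx_tr), y[idx_tr]),
                   design(selected + [j], idx_te), y[idx_te]), j)
              for j in remaining]
    score, j = min(scores)
    if score < best:              # keep j only if held-out NLL decreases
        best, improved = score, True
        selected.append(j)
        remaining.remove(j)
print("selected inputs:", sorted(selected))
```

The truly connected inputs (0 and 3 here) are picked up because they reduce the held-out likelihood, while most purely noisy candidates are rejected — the interpretability benefit described in the text.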
The model coefficients ĉ (and correspondingly ĉh) can be obtained from the estimated Laguerre expansion coefficients as in
Feedforward and feedback kernels are then reconstructed as
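A reconstruction of the kernel-reconstruction equations (notation assumed):

```latex
\hat{k}_1^{(n)}(\tau) = \sum_{j=1}^{L} \hat{c}_1^{(n)}(j)\, b_j(\tau), \qquad
\hat{k}_{2s}^{(n)}(\tau_1,\tau_2) = \sum_{j_1=1}^{L} \sum_{j_2=1}^{L}
    \hat{c}_{2s}^{(n)}(j_1,j_2)\, b_{j_1}(\tau_1)\, b_{j_2}(\tau_2), \qquad
\hat{h}(\tau) = \sum_{j=1}^{L} \hat{c}_h(j)\, b_j(\tau)
```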
Threshold θ is equal to one.
The normalized kernels provide an intuitive representation of the system input–output nonlinear dynamics. Single-pulse and paired-pulse response functions (r1 and r2) of each input can be derived from these kernels.
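Since the inputs are binary spike trains (so that xn² = xn), one reconstruction of these response functions consistent with the Volterra expansion is the following; the exact normalization used by the authors is an assumption:

```latex
r_1^{(n)}(\tau) = k_1^{(n)}(\tau) + k_{2s}^{(n)}(\tau,\tau), \qquad
r_2^{(n)}(\tau_1,\tau_2) = 2\,k_{2s}^{(n)}(\tau_1,\tau_2), \qquad
r_2^{(n_1,n_2)}(\tau_1,\tau_2) = k_{2x}^{(n_1,n_2)}(\tau_1,\tau_2)
```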
The single-pulse response function r1(n) is simply the PSP elicited by a single spike from the nth input neuron; r2(n) describes the nonlinear effect of pairs of spikes from the nth input neuron that differs from the simple summation of their single PSPs. The cross response function r2(n1,n2) represents the nonlinear effect of pairs of spikes with one spike from neuron n1 and one spike from neuron n2. h represents the output spike-triggered after-potential (Fig. 6).
The cross-validation procedure in model selection guarantees that the resulting model has predictive power for novel datasets, since a term is retained only if it decreases the out-of-sample negative log-likelihood. Selected inputs/cross-terms and estimated parameters/coefficients can be readily used to make further inferences about the functional connectivity and neuronal dynamics, as shown in the previous section. However, one also needs to evaluate quantitatively the goodness-of-fit of the model. One way of doing this is to evaluate the continuous firing probability intensity predicted by the model against the recorded output spike train. According to the time-rescaling theorem, an accurate model should generate a conditional firing intensity function Pf that can rescale the recorded output spike train into a Poisson process with unit rate. By a further variable conversion, interspike intervals should be rescaled into independent uniform random variables on the interval (0, 1). The model goodness-of-fit then can be assessed with a Kolmogorov-Smirnov (KS) test: if the model is correct, all points should lie close to (e.g., within the 95% confidence bounds of) the 45-degree line of the KS plot. Another way is to quantify the similarity between the recorded output spike train y and the predicted output spike train ŷ after a smoothing process. First, ŷ is realized through simulation. Second, ŷ and y are convolved with a Gaussian kernel and then compared by calculating their correlation coefficient.
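The time-rescaling check can be sketched as follows: spikes are drawn from a known conditional intensity, interspike intervals are rescaled by the integrated intensity (using the discrete-time correction −log(1 − Pf)), and the rescaled intervals are compared against the uniform distribution with a KS test. The intensity function here is arbitrary and purely illustrative:

```python
import numpy as np
from scipy.stats import kstest

rng = np.random.default_rng(3)

# A model-generated conditional intensity Pf(t) (an arbitrary smooth rate),
# with spikes actually drawn from it, so the "model" is correct by design.
T = 20000
pf = 0.02 * (1 + 0.8 * np.sin(2 * np.pi * np.arange(T) / 500))
y = rng.random(T) < pf

# Time-rescaling: sum -log(1 - Pf) between successive recorded spikes; for a
# correct model the rescaled intervals are unit-rate exponential.
q = -np.log(1.0 - pf)
spikes = np.flatnonzero(y)
z = np.array([q[a + 1:b + 1].sum() for a, b in zip(spikes[:-1], spikes[1:])])
resc = 1.0 - np.exp(-z)     # should be ~Uniform(0, 1) if the model fits
stat, pval = kstest(resc, "uniform")
print(f"KS statistic {stat:.3f}, p = {pval:.3f}")
```

A badly mis-specified intensity (e.g., a constant rate fed into the same rescaling) would push the empirical CDF of `resc` away from the 45-degree line and inflate the KS statistic.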
This method has been successfully implemented in the modeling of hippocampal CA3-CA1 dynamics. In the hippocampus, CA1 pyramidal neurons are primarily driven by CA3 pyramidal cells. Output of the CA1 region thus can be considered a nonlinear transformation of the CA3 spike trains. In the laboratories of Drs. Sam Deadwyler and Robert Hampson at Wake Forest University, rats are trained to perform a memory-dependent behavioral task—the delayed nonmatch-to-sample task. CA3 and CA1 spike trains are simultaneously recorded while the rats are performing the task, and then used to build the MIMO model (Fig. 7). Results show that the MIMO model can be reliably estimated from the CA3 and CA1 spike trains. The model: a) accurately (but stochastically) predicts the CA1 spatiotemporal pattern based on the CA3 spatiotemporal pattern (Fig. 8); b) provides intuitive representations of the CA3-CA1 transfer function by means of the feedforward kernels, feedback kernels, and noise standard deviation; and c) reveals the functional CA3-CA1 connectivity through its significant model terms.
Our modeling approach also must deal with the nonstationarities of hippocampal regions. In a nonstationary (time-varying) system, the input–output transfer function also depends on time (32). The modeling goal is to track the emergence and evolution of the MIMO nonlinear dynamics during learning and memory formation.
We have formulated a nonstationary modeling methodology for the above-described model structure using a point-process adaptive filtering framework . In this approach, model coefficients (c) are taken as state variables while the input–output spikes are taken as observable variables. Using adaptive filtering methods, state variables can be recursively updated as the observable variables unfold in time. The underlying change of system input–output properties then is represented by the time-varying Volterra kernels (k(t) and h(t)) reconstructed with the time-varying coefficients (c(t)).
First, the probability of observing an output spike at time t, i.e., Pf(t), is predicted by the GVM at time t − 1 based on the inputs up to t and the outputs before t (14). Second, the difference between Pf(t) and the new observation of the output y(t) is used to correct the GVM model coefficients. Using the stochastic state point process filtering algorithm, the coefficient vector C(t) and its covariance matrix W(t) are both updated iteratively at each time step t,
where Q is the coefficient noise covariance matrix. Including W as the “learning rate” allows reliable and rapid tracking of the model coefficients C representing the system nonlinear dynamics.
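In the general form of the stochastic state point process filter (Eden et al.), the prediction and update steps can be written as follows; whether the authors use exactly this discretization is an assumption:

```latex
C(t \mid t-1) = C(t-1), \qquad W(t \mid t-1) = W(t-1) + Q
```

```latex
\bigl[W(t)\bigr]^{-1} = \bigl[W(t \mid t-1)\bigr]^{-1}
  + \left(\frac{\partial \log P_f}{\partial C}\right)^{\!\top}
    P_f \left(\frac{\partial \log P_f}{\partial C}\right)
  - \bigl(y(t) - P_f\bigr)\,
    \frac{\partial^2 \log P_f}{\partial C\,\partial C^{\top}}
```

```latex
C(t) = C(t \mid t-1)
  + W(t)\left(\frac{\partial \log P_f}{\partial C}\right)^{\!\top}
    \bigl(y(t) - P_f\bigr)
```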
We have intensively tested this nonstationary algorithm with synthetic input–output spike train data obtained through simulations. The tested systems have various model structures involving different model orders, e.g., first and second order (including self- and cross-kernels). The number of system inputs ranges from moderate to large scale (e.g., 32 inputs), which matches the maximal number of available units in a typical experimental dataset. The system nonstationarity to be tracked takes a variety of forms, such as: a) step (jump) change; b) linear change; and c) LTP/LTD-like changes. Results show that the nonstationary algorithm can reliably and accurately track the underlying system nonstationarities and represent them in the time-varying Volterra kernels (see Fig. 9 for a second-order, two-input, step-change example). In all cases, the estimated kernels converge rapidly (on a 10–100 s timescale) to the target kernels without interfering with each other.
The nonstationarity in the transfer function of a given brain region is determined by the experiences of the animal. In the brain region, those experiences are internally represented as the flow of the input/output spatiotemporal patterns of spike trains. A fundamental question is whether it is possible to reconstruct the nonstationarity of the transfer function of a brain region, as characterized in Step 2, using its input–output spike trains and a learning rule defining how to modify the transfer function based on those spike trains (Fig. 4 right). Such a learning rule is critical for understanding the underlying mechanisms of cognitive processes, e.g., Hebbian-like synaptic modification during learning and memory formation.
We propose to conduct mathematical analyses and computer simulations of neuronal network nonlinear, nonstationary dynamics to identify such potential learning rules. As a first step, a neuronal network model can be built and initialized based on the MIMO nonlinear dynamics identified from naïve animals. The functional connections between input and output neurons will be determined from the feedforward kernels; the spike-dependent intrinsic properties of the output neurons will be determined from the feedback kernels. In the next step, learning processes in the brain region will be simulated by feeding the network model with the input sequence recorded from the modeled brain region during learning. The input–output transfer function S(t) will be updated by the input and output patterns following a learning rule L. Finally, the learning rule is substantiated through mathematical analyses, and the associated parameters are optimized so that the emergence and changes of the transfer function characterized in Step 2 can be replicated in the simulation. The candidate learning rules include Hebbian-like and spike-timing-dependent modification rules.
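As an illustration of what a candidate learning rule L might look like, the sketch below applies an STDP-like update to a single feedforward coefficient; the rule, its parameters, and the correlated spike trains are entirely hypothetical:

```python
import numpy as np

rng = np.random.default_rng(4)

def stdp_update(c, pre, post, t, a_plus=0.01, a_minus=0.012, tau=10.0, win=30):
    """One candidate learning rule: an STDP-like update of a feedforward
    coefficient c for a single input. Pre-before-post pairings within `win`
    bins potentiate; post-before-pre pairings depress. Purely illustrative;
    all parameter values are hypothetical."""
    if post[t]:
        for dt in range(1, win):        # pre spikes preceding this post spike
            if t - dt >= 0 and pre[t - dt]:
                c += a_plus * np.exp(-dt / tau)
    if pre[t]:
        for dt in range(1, win):        # post spikes preceding this pre spike
            if t - dt >= 0 and post[t - dt]:
                c -= a_minus * np.exp(-dt / tau)
    return c

# Causally correlated trains (post tends to follow pre by 2 bins), so the
# coefficient should drift upward, mimicking LTP-like kernel growth.
T = 2000
pre = rng.random(T) < 0.05
post = np.r_[np.zeros(2, bool), pre[:-2]] & (rng.random(T) < 0.8)
c = 1.0
for t in range(T):
    c = stdp_update(c, pre, post, t)
print(f"coefficient drifted from 1.0 to {c:.2f}")
```

In the proposed framework such a rule would act on the time-varying coefficients c(t), and its parameters would be optimized until the simulated kernel trajectories replicate those identified in Step 2.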
We expect the final outcome of this step to be a generative model of the identified nonstationarities of the hippocampal population nonlinear dynamics.
In this paper, we have dealt with the issue of the neurobiological bases of cognition. More specifically, we have argued that nonlinear input–output properties of populations of neurons are potential neurobiological indices of cognitive processing. We have demonstrated, both here and previously, that nonlinear input–output properties of single neurons and populations of neurons can be measured experimentally (electrophysiologically), and for mathematical modeling, can be readily incorporated within a theoretical framework of nonlinear systems identification. We also have presented here some of the most recent methodological advances in nonlinear systems modeling that provide the critical capabilities for achieving systems-level descriptors of neural function—systems-level descriptors that can be proposed and investigated as potential correlates of cognitive function. These new methodologies allow input–output properties to be defined in the context of high-order nonlinearities, nonstationarities (synaptic plasticity) of nonlinearities, and population, ensemble coding of neural information.
Before discussing these concepts and approaches in the context of the cognitive function of the hippocampus, it is important to state some assumptions. First, we assume that cognitive functions reflect the highest levels of neural function, i.e., neural operations that involve entire systems of neurons. For example, the cognitive function of creating new long-term memories from existing short-term memories is performed by the hippocampal formation, which is a collection of cortical neural structures consisting of the entorhinal cortex, the dentate gyrus, the CA3 pyramidal cell region (the regio inferior of the hippocampus), the CA1 pyramidal cell region (the regio superior of the hippocampus), and the subiculum. The hippocampus proper—dentate, CA3, and CA1—is considered the “intrinsic trisynaptic pathway” of the hippocampus, and is the minimum circuitry involved in the short-term memory to long-term memory transformation. Second, we assume that the collective functional properties of the dentate, CA3, and CA1, when combined together, are equivalent to the cognitive function of “long-term memory formation.” Third, we assume that the functional properties of the components of the hippocampus proper identified above, and for that matter most any brain region, can be assessed as “input–output properties,” i.e., the manner in which incoming signals are processed into different, outgoing signals. At a neural level, the composite input–output properties of the major, intrinsic pathways of a brain region are its function.
When the kernels are estimated accurately for the appropriate order nonlinearity, and for neural data generated under “natural” conditions, the kernels: 1) describe how the neural system, at each one of its major layers or subsystems, responds to the range of input signals associated with the set of behaviors and/or cognitive states of interest; 2) describe how neural correlates of the behavior of interest (#1) are transformed from the system input to the system output, and at each of its major layers or subsystems; and 3) allow prediction of system and subsystem output for a wide range of activity conditions.
Clinical studies conducted over the last 60 years have clarified that the hippocampal formation is responsible for long-term memory formation. The hippocampal system does not store memories itself; instead, it re-encodes short-term memory so that the information is compatible with existing long-term memory. Precisely what “compatibility” means remains unknown, but as an example, compatibility might mean that appropriate first-order associations for a given episodic memory had been identified. Long-term memory is stored in a distributed manner, probably throughout neocortex. With the hippocampus defined as the set of brain systems above, “long-term memory formation” must be equivalent to the total re-encoding process performed as inputs propagate from the dentate gyrus to the CA1 region. How can this “re-encoding process” be assessed and understood? As stated above, and as demonstrated in previous sections of this paper, we assume that the functional properties of any network of neurons (or for that matter, any neuron, or any neuron component, e.g., channel, etc.) can be represented in terms of its input–output properties, or in this case, its nonlinear multiple-input, multiple-output properties. Given the arguments made earlier, and from data described above, it is our position that neurons should be conceived of as nonlinear dynamical processing elements. Because of the inherent nonlinear properties of hippocampal neurons and the nonlinearities inherent in the processes of synaptic transmission, input spatiotemporal patterns of spike train activity are transformed into different, output spatiotemporal patterns of spike train activity. The nature and degree of nonlinear transformation will almost certainly vary as a function of hippocampal region because of differences in principal cell morphology, and/or intrinsic conductances (e.g., distribution, type of active channels), and/or local circuitry.
Nonetheless, as activity propagates from the entorhinal cortex to the subiculum, each layer of the hippocampus (dentate gyrus, CA3, and CA1) progressively reencodes short-term memory representations into long-term memory representations.
The total re-encoding process whereby short-term memories become long-term memories can be assessed experimentally, and modeled mathematically, in the manner demonstrated previously with regard to multi-input, multioutput properties of the CA3-CA1 hippocampal system. If the same analyses were performed for the entorhinal-dentate and the dentate-CA3 subsystems of the hippocampus, then computer simulations of the functioning of all three subsystems of the hippocampus would be attainable. We have investigated previously the possibility of analytical solutions to the combination of subsystem nonlinear characterizations to achieve larger nonlinear system input–output models, and vice versa for system decomposition, but these studies were of single-input, single-output cases only. Experimental verification of such a simulated model of the functioning of the intrinsic, hippocampal “trisynaptic pathway,” though difficult, is possible (the requirement would be simultaneous recordings of neural activity from two sites in the hippocampal formation separated by two or more synapses, e.g., layer II of the entorhinal cortex and the CA1 pyramidal cell region).
Considering all of the above, we believe it is experimentally and theoretically feasible to characterize each of the subregions of the hippocampus (dentate, CA3, and CA1) and then to integrate the dynamics of each layer into a model of the intrinsic, trisynaptic pathway of the hippocampal system, though we are a long way from demonstrating this. The nonlinear transformations of the entire circuit should be equal to the total nonlinear transformations required to convert short-term memory into long-term memory—though this also is an example of a hypothesis that should be tested by such a combined theoretical-experimental approach. The meaning of the transformations of any one layer is unknown, and again, this identifies an important area of future study. Clinical and experimental animal studies have provided compelling clues as to the function of the entire hippocampus, but we have only a few hypotheses as to the functional role of each hippocampal subsystem. Input–output studies of each individual component of the hippocampus will provide quantifications of the properties of each of the dentate, CA3, and CA1 fields, and in the process, also provide hints as to subsystem functions to which we previously have not had access. The major point, however, is that a combined theoretical-experimental path can be defined for achieving a biologically based, animal model of a highly important cognitive function—long-term memory formation.
The relevance of this approach to neural prostheses is that, if the positions argued here are correct, the complexities of higher brain processing related to concept formation, representations, hierarchically organized associations, etc., and potentially even consciousness, i.e., the brain functions least understood in neural terms at present and most difficult to repair following brain damage, may be represented mathematically as a set of kernels. We have presented an example of such a characterization with modeling of the CA3-CA1 transformation contribution to long-term memory. Such a set of kernels could even be parameterized for context, for example, for the sleep-wake cycle, and as we have shown in previous work, can be reduced to hardware circuitry. What is remarkable about a kernel-based model, in addition to the attributes identified above, is the degree of “compactness” of the input–output relation: all of the mechanisms underlying the highly nonlinear behavior of hippocampal (or other class) neurons, including the contribution of interneurons, and notably, the contribution of unknown mechanisms yet to be discovered, are included in the model, and as shown here, the model in turn can accurately predict system output to arbitrary input patterns. This is a major advantage of our approach compared to, for example, linear or low-order nonlinear models that are the bases of neural prostheses to replace lost upper extremity functionality. We are in the process of testing the hypothesis that kernel functions for the hippocampus can interact with the endogenous tissue to reinstate normal long-term memory capability after hippocampal dysfunction has been induced experimentally. If successful, this experimental-modeling work will lay the foundation for a general strategy to develop neural prostheses for any one of multiple cognitive functions.
Given the availability of such models, additional research investigating the nonlinear transformations of a given brain region with the purported cognitive functions of the same neural system could provide substantial insights into the relations between neural and cognitive dynamics.
This work was supported in part by the National Science Foundation (NSF), in part by the DARPA through the Human-Assisted Neural Devices (HAND) Program, and in part by the National Institutes of Health (NIH) through the National Institute of Biomedical Imaging and BioEngineering (NIBIB) program. D.S. was partially supported by the James H. Zumberge Faculty Research and Innovation Fund at the University of Southern California.
Theodore W. Berger (Fellow, IEEE) received the Ph.D. degree from Harvard University, Cambridge, MA, in 1976; his thesis work received the James McKeen Cattell Award from the New York Academy of Sciences.
He conducted postdoctoral research at the University of California, Irvine from 1977 to 1978, and was an Alfred P. Sloan Foundation Fellow at the Salk Institute from 1978 to 1979. He joined the Departments of Neuroscience and Psychiatry at the University of Pittsburgh in 1979, being promoted through to Full Professor in 1987. Since 1992, he has been Professor of Biomedical Engineering and Neurobiology at the University of Southern California, and was appointed the David Packard Chair of Engineering in 2003. He became Director of the Center for Neural Engineering in 1997, an organization which helps to unite USC faculty with cross-disciplinary interests in neuroscience, engineering, and medicine. He has published over 170 journal articles and book chapters, and is the coeditor of a book recently published by the MIT Press, Toward Replacement Parts for the Brain: Implantable Biomimetic Electronics as Neural Prostheses. His research interests are in the development of biologically realistic, experimentally based, mathematical models of higher brain (hippocampus) function; application of biologically realistic neural network models to real-world signal processing problems; VLSI-based implementations of biologically realistic models of higher brain function; neuron–silicon interfaces for bidirectional communication between brain and VLSI systems; and next-generation brain-implantable, biomimetic signal processing devices for neural prosthetic replacement and/or enhancement of brain function.
Prof. Berger has received a McKnight Foundation Scholar Award, twice received an NIMH Research Scientist Development Award, and was elected a Fellow of the American Association for the Advancement of Science. While at USC, he has received an NIMH Senior Scientist Award, was given the Lockheed Senior Research Award in 1997, was elected a Fellow of the American Institute for Medical and Biological Engineering in 1998, received a Person of the Year “Impact Award” by the AARP in 2004 for his work on neural prostheses, was a National Academy of Sciences International Scientist Lecturer in 2003, and an IEEE Distinguished Lecturer in 2004–2005. He received a “Great Minds, Great Ideas” award from the EE Times in 2005, and in 2006 was awarded USC’s Associates Award for Creativity in Research and Scholarship.
Dong Song (Member, IEEE) received the B.S. degree in biophysics from the University of Science and Technology of China, Hefei, in 1994 and the Ph.D. degree in biomedical engineering from the University of Southern California (USC), Los Angeles, in 2003.
From 2004 to 2006, he worked as a Postdoctoral Research Associate at the Center for Neural Engineering at USC. He is currently a Research Assistant Professor in the Department of Biomedical Engineering at USC. His main research interests include nonlinear systems analysis of the nervous system, cortical neural prostheses, electrophysiology of the hippocampus, long-term and short-term synaptic plasticity, and the development of modeling methods incorporating both parametric and nonparametric modeling techniques.
Prof. Song is a member of the Biomedical Engineering Society, the American Statistical Association, and the Society for Neuroscience.
Rosa H. M. Chan (Student Member, IEEE) received the B.Eng. degree in automation and computer-aided engineering from the Chinese University of Hong Kong (CUHK), Hong Kong, in 2003. She is currently working toward the Ph.D. degree in the Department of Biomedical Engineering at the University of Southern California.
Her research interest is in the development of cortical neural prosthesis.
Ms. Chan was awarded both the Croucher Scholarship and the Sir Edward Youde Memorial Fellowship for overseas study.
Vasilis Z. Marmarelis (Fellow, IEEE) was born in Mytiline, Greece, on November 16, 1949. He received the Diploma degree in electrical engineering and mechanical engineering from the National Technical University of Athens in 1972 and the M.S. and Ph.D. degrees in engineering science (information science and bioinformation systems) from the California Institute of Technology, Pasadena, in 1973 and 1976, respectively.
After two years of postdoctoral work at the California Institute of Technology, he joined the faculty of Biomedical and Electrical Engineering at the University of Southern California, Los Angeles, where he is currently Professor and Director of the Biomedical Simulations Resource, a research center funded by the National Institutes of Health since 1985 and dedicated to modeling/simulation studies of biomedical systems. He served as Chairman of the Biomedical Engineering Department from 1990 to 1996. He is coauthor of the book Analysis of Physiological Systems: The White-Noise Approach (New York: Plenum, 1978; Russian translation: Moscow, Mir Press, 1981; Chinese translation: Academy of Sciences Press, Beijing, 1990), editor of three research volumes on Advanced Methods of Physiological System Modeling (Plenum, 1987, 1989, 1994), and author of a monograph on Nonlinear Dynamic Modeling of Physiological Systems (IEEE Press & Wiley Interscience, 2004). He has published more than 100 papers and book chapters in the areas of system modeling and signal analysis. His main research interests are in the areas of nonlinear and nonstationary system identification and modeling, with applications to biology and medicine. Other interests include spatiotemporal and multi-input/multioutput modeling of nonlinear systems, with applications to neural information processing, closed-loop system modeling, and high-resolution 3-D ultrasonic imaging and tissue classification.
Prof. Marmarelis is a fellow of the American Institute for Medical and Biological Engineering.