Most of the computations and tasks performed by the brain require the ability to tell time, and process and generate temporal patterns. Thus, there is a diverse set of neural mechanisms in place to allow the brain to tell time across a wide range of scales: from interaural delays on the order of microseconds to circadian rhythms and beyond. Temporal processing is most sophisticated on the scale of tens of milliseconds to a few seconds, because it is within this range that the brain must recognize and produce complex temporal patterns—such as those that characterize speech and music. Most models of timing, however, have focused primarily on simple intervals and durations, thus it is not clear whether they will generalize to complex pattern-based temporal tasks. Here, we review neurobiologically based models of timing in the subsecond range, focusing on whether they generalize to tasks that require placing consecutive intervals in the context of an overall pattern, that is, pattern timing.
The dynamic nature of our environment and the need to move, communicate, and anticipate when events will happen contributed to the evolution of neural mechanisms that allow the brain to tell time. On one extreme, animals detect the microsecond delays it takes sound waves to travel from one side of the head to the other in order to localize sound sources in space. On the other extreme, circadian rhythms allow animals to track day-night cycles in the absence of external cues [2–3]. Between these extremes, humans and other animals also time events on the order of seconds to minutes. Humans, for example, anticipate the duration of traffic lights or the time between telephone rings. Similarly, some animals track the amount of time between visits to food sources in order to optimize foraging [4–5]. Finally, rodents and other animals can be trained on a diverse range of temporal tasks, such as peak interval procedures in which they learn the interval between a stimulus and reward availability [6–8].
In the above examples, animals primarily need to time isolated intervals or durations, as opposed to complex temporal patterns defined by the relative timing of multiple consecutive intervals. The prosody of speech and the rhythm of music, for example, are not defined by any single interval or duration, but by the global temporal structure of many consecutive intervals. Furthermore, speech and music require timing multiple embedded temporal patterns. For example, voice-onset time (the interval between air release and vocal cord vibration) contributes to phoneme discrimination, the duration of vowels and pauses between words conveys information about phrase boundaries [10–11], and speech rate and contour contribute to prosody and comprehension [12–14]. Thus, speech relies on timing over a number of different scales and features in parallel.
Perhaps the clearest example of just how sophisticated our ability to process complex temporal patterns can be is that language is reducible to a purely temporal code. Specifically, when individuals communicate via Morse code, the information is contained in the duration of tones, the interval between them, and their global structure. At the relatively low speed of 10 words per minute, each dot and dash is 120 and 360 ms long, respectively, and the inter-letter and inter-word intervals are 360 and 840 ms. The offset of any tone marks the stop time of a duration and the start time of an interval. This fact helps constrain the possible timing mechanisms underlying Morse code recognition, as any mechanism that requires a significant amount of time to “reset” before timing the next interval would be unlikely to satisfy the temporal requirements of Morse code.
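The durations quoted above are consistent with the standard Morse timing convention, in which a dash lasts three dot units, letters are separated by three units, words by seven units, and the reference word "PARIS" (50 units) fixes the dot length at 1200/wpm milliseconds. The following is a minimal sketch of that arithmetic; the function names `morse_timing_ms` and `letter_events` and the small `MORSE` table are illustrative choices, not part of any model discussed here.

```python
# Illustrative helper names; only the timing convention itself is standard.
MORSE = {"H": "....", "E": ".", "L": ".-..", "O": "---"}


def morse_timing_ms(wpm):
    """Element durations in ms, assuming the PARIS convention (dot = 1200 / wpm)."""
    dot = 1200.0 / wpm
    return {
        "dot": dot,
        "dash": 3 * dot,
        "intra_gap": dot,       # silence between elements of one letter
        "letter_gap": 3 * dot,  # silence between letters
        "word_gap": 7 * dot,    # silence between words
    }


def letter_events(letter, wpm):
    """Alternating ('on' | 'off', duration_ms) events for a single letter."""
    t = morse_timing_ms(wpm)
    events = []
    for i, symbol in enumerate(MORSE[letter]):
        if i > 0:
            events.append(("off", t["intra_gap"]))
        events.append(("on", t["dot"] if symbol == "." else t["dash"]))
    return events


timing = morse_timing_ms(10)
l_events = letter_events("L", 10)
```

Each "off" event begins at the instant the preceding "on" event ends, which makes explicit why a mechanism needing a long reset between intervals would struggle with this code.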
To distinguish temporal tasks that require timing isolated intervals from those that require timing multiple consecutive intervals within a global context, we will use the terms interval timing (although we note that this term is commonly used for timing in the range of seconds to minutes) and pattern timing (Figure 1).
While most psychophysical tasks focus on interval timing, a number of temporal tasks rely on the production or discrimination of intervals embedded within a global pattern. Such tasks include:
To simplify our discussion, we will focus on aperiodic patterns as opposed to periodic tasks in which subjects have to discriminate or reproduce isolated or repetitive intervals. However, there are data suggesting that periodic and aperiodic timing tasks rely on different neural mechanisms [21–22].
It is clear that the brain uses multiple neural mechanisms to tell time across temporal scales. For example, the mechanisms underlying sound localization, the ability to tap along with the beat of a song, and the generation of circadian rhythms are clearly distinct [23–24]. However, it is less clear whether the neural mechanisms underlying interval and pattern timing are the same: does pattern timing rely on the timing of independent intervals, like marking the laps on a stopwatch, or is each interval automatically encoded in the context of a pattern? Here we ask whether the same mechanisms that have been proposed to underlie simple forms of timing can also account for complex temporal tasks such as recognizing and producing letters in Morse code. To answer this question we examine three classes of neurobiologically based timing models, that is, those that have been implemented at the level of simulated neurons (spiking or firing rate).
One of the simplest models of how time might be represented in networks of neurons is a synfire chain, which is generally composed of a large number of neurons arranged in separate pools connected with a feed-forward architecture (Fig. 2) [25–27]. Activity propagates from one pool to another, such that each pool is activated at different points in time—e.g., pool one is activated at t=0, while neurons in pool 10 might be activated at t=100 ms. Thus it is possible for downstream circuits to read out the elapsed time by detecting which pool is currently active. Similarly, it is possible to produce a timed motor response by connecting the appropriate pool to appropriate output units. In their simplest form, synfire chains implement delay lines, in which each synaptic step inserts an additional delay.
It is easy to see how a synfire chain could be used to detect or produce a 100 ms interval. To produce the interval, the neurons activated at 100 ms should be connected to the appropriate output. For detection, a readout neuron should fire only when it receives simultaneous input from the 100 ms pool and a direct sensory pathway. Importantly, timing via synfire chains could also underlie pattern timing. For example, at the motor level a synfire chain could potentially be used to generate a complex pattern by connecting each pool of the chain to sequential output units. Indeed, songbird studies have provided evidence that synfire chains could underlie the complex forms of timing necessary for song generation. This complex timing appears to be generated by sequential bursting of a subset of neurons in the songbird sensorimotor nucleus HVC. These HVC neurons fire at specific moments during a song, providing the timing necessary for the structure of each syllable and the sequence of syllables within a song [28–29]. Using in vivo recordings and spiking neural network simulations, Long and colleagues provided evidence that these firing patterns are consistent with a feedforward synfire chain network architecture (Figure 2).
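The delay-line reading of a synfire chain can be sketched in a few lines. The example below is a deliberately abstract illustration, not a spiking simulation: it assumes (hypothetically) that each pool stays active for a fixed 10 ms step, that pool k (counting from 0) is active from k × 10 ms after chain onset, and that the readout is a pure coincidence detector. The function names are illustrative.

```python
def active_pool(t_ms, step_ms=10, n_pools=20):
    """Index of the pool active t_ms after chain onset, or None outside the chain."""
    k = int(t_ms // step_ms)
    return k if 0 <= k < n_pools else None


def readout_fires(second_stim_ms, target_ms=100, step_ms=10, n_pools=20):
    """Interval detection: the readout fires only if the sensory input
    arrives while the pool tuned to target_ms is active."""
    target_pool = int(target_ms // step_ms)
    return active_pool(second_stim_ms, step_ms, n_pools) == target_pool


def motor_pattern(output_pools, step_ms=10):
    """Pattern production: each listed pool drives an output pulse
    at the time that pool becomes active."""
    return [k * step_ms for k in sorted(output_pools)]


detected_on_time = readout_fires(100)   # stimulus arrives with the 100 ms pool
detected_early = readout_fires(60)      # a different pool is active
pulse_times = motor_pattern([0, 12, 48])
```

In this sketch the temporal resolution of both detection and production is set by the per-pool step, and a pattern is produced simply by choosing which pools project to the output.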
Numerous additional studies in mammals have revealed synfire chain-like temporal signatures. Specifically, populations of neurons that fire during specific points in time ("temporal receptive fields") reveal chain-like activation patterns when sorted according to firing latency, typically visualized as a diagonal band of activity [8,31–32]. It remains unclear, however, whether these patterns are generated locally by the circuits being recorded, and if so, whether they are a result of feedforward synfire architecture. Indeed, while these patterns of activity are certainly suggestive of a feedforward network, a number of computational models have shown that they can emerge from the propagation of activity within recurrent neural networks (see below).
Cortical circuits are characterized by recurrent connectivity between local pyramidal neurons [33–34]. While synfire chains can in principle incorporate recurrent connections, in practice they are typically implemented within purely feed-forward architectures. Consequently, it is highly unlikely that cortical networks are actually feedforward synfire chains. A related and important issue is the capacity of synfire chains. Specifically: how many trajectories of a given temporal length can one feedforward synfire network encode? In one sense the capacity is low. For example, in a purely feed-forward network, if we assume that each neuron fires only once and must participate in every pattern, then the capacity is essentially one trajectory (which is not the case in a recurrent network given these same assumptions). However, if we assume that different subpopulations of neurons within a pool fire during different trajectories, then the capacity increases significantly.
Overall, synfire chains offer a potentially general, and biologically plausible, mechanism to account for both interval and pattern timing. However, the traditional focus on feedforward synfire chains is probably unrealistic because of their lack of recurrent connectivity and their limited capacity.
Other neurobiologically-based models of timing explicitly rely on positive feedback through recurrent excitatory connections. One such model was developed to account for a series of experimental observations of Shuler and colleagues who reported that V1 neurons can encode the interval between stimulus onset and a reward [37–39]. In the basic task, rats are exposed to a visual flash which predicts the arrival of a water reward after a delay Δt (more specifically, the reward was available after a fixed number of licks, which correlates with time). In vivo single unit recordings from V1 revealed that a subpopulation of neurons encode the reward interval—for example, some neurons maintained a relatively high firing-rate during the delay period.
To account for these experimental data, Shouval and colleagues developed a spike-based recurrent neural network model and described how local cortical circuits might encode the reward interval [39–42]. Their hypothesis is that the observed prolonged activity is generated through recurrent excitatory connections. This approach has parallels in models of short-term memory that have used well-tuned positive feedback to maintain a fixed level of activity [43–45]. We also note that there is a significant experimental literature reporting that neurons can exhibit ramping firing rates during well-trained temporal tasks [46–48], and that some computational models of ramping neurons also incorporate network-level positive feedback.
In Shouval’s model, positive feedback is used, in effect, to generate a long network time constant. The authors find that potentiating the recurrent connections through a self-organizing Hebbian-like plasticity rule can extend network-wide firing elicited by a specific cue. Recently it has been shown that the synaptic tuning can be achieved using an experimentally derived form of associative learning that takes into account the fact that the reinforcement signal is delayed in relation to the activity patterns that trigger LTP or LTD. Importantly, the recurrent positive feedback does not maintain the activity at some fixed point, as in working memory models. Rather, low levels of positive feedback are sufficient to extend the amount of time that the network is active, effectively controlling the network-level decay time. In this manner the mean duration of the evoked firing can represent reward onset time. Significantly, these firing patterns are a result of the dynamics within the local cortical network: they do not require tonic external input or dedicated timing cells, supporting the theory that temporal processing is a general and intrinsic property of recurrent neural networks [50–52].
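How positive feedback lengthens a network-level decay time can be illustrated with a one-variable linear rate model. This is a textbook simplification, not the spiking model of Shouval and colleagues: it assumes a population rate r obeying τ dr/dt = −r + w·r with feedback strength 0 ≤ w < 1, for which the effective time constant is τ/(1 − w). The function name and parameter values are illustrative.

```python
import math


def decay_time_ms(w, tau_ms=10.0, dt_ms=0.01, r0=1.0):
    """Time for r to fall to r0/e under tau * dr/dt = -r + w * r (forward Euler).

    w is the strength of recurrent positive feedback (assumed 0 <= w < 1, so
    activity decays rather than being held at a fixed point or growing).
    """
    if not 0.0 <= w < 1.0:
        raise ValueError("w must satisfy 0 <= w < 1")
    r, t = r0, 0.0
    threshold = r0 / math.e
    while r > threshold:
        r += dt_ms / tau_ms * (-r + w * r)
        t += dt_ms
    return t


no_feedback = decay_time_ms(0.0)    # close to tau = 10 ms
with_feedback = decay_time_ms(0.9)  # close to tau / (1 - 0.9) = 100 ms
```

In this toy setting, tuning w toward 1 stretches the decay arbitrarily, which is one way to see why learning-driven changes in recurrent strength could set the duration of cue-evoked firing.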
While this positive feedback model elegantly accounts for the experimental data on a form of interval timing, it is unclear if it would extend to pattern timing. Specifically, positive feedback mechanisms seem unlikely to be able to time consecutive intervals because the network would have to be rapidly reset at the end of each interval, which also marks the beginning of the next. Thus, the experimental and computational results of Shuler and Shouval further suggest the presence of distinct mechanisms for interval and pattern timing.
One of the first neurobiologically-based models of timing and temporal processing proposed that networks of neurons are intrinsically able to tell and encode time as a result of dynamic changes in the state of neural networks [53–55]. Specifically, this model states that the evolving neural population activity is a code that represents time.
At the sensory level, the hypothesis is that the discrimination of temporal intervals arises from the interaction between the internal-state of a network and incoming stimuli. In this sensory mode, the recurrent weights of these networks are generally fairly weak—that is, not capable of sustaining self-perpetuating activity. Thus, much of the temporal information emerges from neural and synaptic properties that are naturally time-varying (the so-called hidden states—e.g. short-term synaptic plasticity). Such models have been shown to effectively discriminate not only simple intervals, but complex temporal patterns as well [51,56–59]. The hypothesis is that each sensory event interacts with the current state of the network, forming a pattern of network states that naturally encodes each event in the context of the recent stimulus history—much as the ripples generated by each raindrop falling in a pond will interact with the ripples created by previous raindrops. Experimental studies have supported this hypothesis by demonstrating that cortical networks contain information about not only the current stimulus, but also the interval and order of recent events [60–64].
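The idea that a hidden state can carry interval information is easy to see in a toy example of short-term facilitation. The sketch below is not one of the published network models: it assumes a single synapse whose response to a second pulse is boosted by a facilitation trace that decays exponentially after the first pulse, with hypothetical parameter values.

```python
import math


def second_pulse_response(interval_ms, tau_ms=100.0, facilitation=0.5):
    """Response to the second of two pulses, relative to a baseline of 1.0.

    The first pulse leaves a facilitation trace that decays with time
    constant tau_ms, so the response depends on the inter-pulse interval.
    """
    return 1.0 + facilitation * math.exp(-interval_ms / tau_ms)


responses = {dt: second_pulse_response(dt) for dt in (50, 100, 200)}
```

Because the response amplitude varies monotonically with the interval in this toy, a downstream unit with a suitable threshold could distinguish short from long intervals without any explicit clock.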
The same general framework has also been applied to timing in the motor domain [55,65–67]. In contrast to sensory timing, motor timing relies on the active production of a response at the appropriate interval after a start cue. Therefore, in the motor regime, the recurrent connections need to be relatively strong, i.e. capable of self-perpetuating activity. In state-dependent models of motor behavior, time is encoded in the dynamically changing patterns of active neurons, forming a population clock. The activity in the network traces out a trajectory in neural state space, in which each point in time corresponds to a unique population of active neurons. These patterns can be sparse: a few neurons activated at any point in time and each neuron activated at only one point, as in a synfire chain; or “dense”: with many neurons activated at a time, and each neuron potentially active at different points in the same trajectory (we can think of these as “high-entropy” trajectories). Experimental studies have reported numerous examples of either sparse functional feed-forward patterns of activity [8,28,30,69–70], or complex high-entropy patterns [71–74] of activity that encode time. A recent experimental and computational study also provided support for the notion that time is represented in high-dimensional trajectories. In this work, recordings from over 100 neurons in the premotor cortex revealed a neural trajectory that evolved over a period of seconds during a task in which monkeys expected a reward between 1.5 and 3.5 seconds after the start cue. Analysis suggested that the reward window was represented in a trajectory segment, and that temporal expectation was intrinsically represented because this segment was the closest to a boundary that, if crossed, triggered a motor response.
Fig. 3 provides an example of pattern timing in a population clock model implemented in a simulated recurrent neural network (RNN) based on firing rate units. The network starts in a high-gain regime which generates a high-dimensional trajectory in response to a brief input. The network is then trained to reproduce this “innate” trajectory by adjusting the weights of the recurrent network. As a result of this training, the trajectory becomes locally stable (a “dynamic attractor”). Because this trajectory is stable in high-dimensional space, the output unit can then be trained to produce an arbitrarily complex temporal pattern, in this case the Morse code spelling of “Hello.” Here it should be stressed that the learning rule used to adjust the recurrent weights is not biologically plausible.
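The readout stage of such a population clock can be sketched with a small firing-rate network. The example below is a simplified illustration under stated assumptions and does not reproduce the simulation in Fig. 3: the recurrent weights are left random and untrained (so the trajectory is not stabilized against noise), the run is noise-free, the readout is fit by ordinary least squares, and the network size, gain, time constant, and the 40 ms dot used for the Morse letter "L" are all hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, g, tau_ms, dt_ms = 300, 1.5, 10.0, 1.0  # hypothetical parameters

# Random recurrent weights; g > 1 places the untrained network in a high-gain regime.
W = g * rng.standard_normal((N, N)) / np.sqrt(N)

# Target output: Morse "L" (. - . .) as a 0/1 time series, one sample per ms.
dot = 40                 # dot length in ms (hypothetical, keeps the run short)
symbols = [1, 3, 1, 1]   # element lengths in dot units
target = []
for i, n_units in enumerate(symbols):
    if i > 0:
        target += [0.0] * dot          # intra-letter gap
    target += [1.0] * (n_units * dot)  # tone
target = np.array(target)
T = len(target)

# Run the network from a fixed initial state and record the firing rates.
x = rng.standard_normal(N)
rates = np.empty((T, N))
for t in range(T):
    r = np.tanh(x)
    rates[t] = r
    x = x + dt_ms / tau_ms * (-x + W @ r)

# Fit a linear readout (plus bias) from the population trajectory to the target.
X = np.hstack([rates, np.ones((T, 1))])
w_out, *_ = np.linalg.lstsq(X, target, rcond=None)
output = X @ w_out
mse = float(np.mean((output - target) ** 2))
```

Only the output weights are learned here; making the same readout robust to noise is what requires the additional step of training the recurrent weights so that the trajectory becomes locally stable.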
Synfire chain and positive feedback models can certainly be applied to pattern timing, but we suggest that state-dependent network models are better suited for pattern timing because they are inherently high dimensional. Consider that six different isolated intervals, when arranged into a sequence of four (that is, a pattern composed of four intervals), can produce a total of 1296 potential patterns. A single state-dependent network is well suited to learn any arbitrary set of these patterns. Thus, state-dependent networks and related reservoir computing models [76–78] represent general computational frameworks capable not only of interval and pattern timing, but also of spatial and temporal computations.
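The count of 1296 follows from allowing each of the four positions in the pattern to take any of the six intervals independently, giving 6^4 ordered sequences. A short enumeration makes this concrete; the six interval values are hypothetical placeholders.

```python
from itertools import product

# Six isolated intervals (hypothetical values, in ms).
intervals_ms = [100, 200, 300, 400, 500, 600]

# Every ordered sequence of four intervals, with repetition allowed.
patterns = list(product(intervals_ms, repeat=4))
n_patterns = len(patterns)
```

The number of patterns grows exponentially with sequence length, whereas the number of distinct intervals a pure interval timer must represent stays at six.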
The great majority of experimental and theoretical work on timing in the subsecond range has focused on isolated intervals and durations, i.e. interval timing. Here we stress that within this time scale the brain also performs a wide range of temporal tasks that require processing consecutive intervals and placing these in a temporal context—speech, music, and Morse code being clear examples of such pattern timing.
In addition to the distinction between interval and pattern timing, there are other temporal features that still must be carefully addressed in computational models. Of particular relevance is temporal scaling: how do we produce or recognize the same global temporal pattern at different speeds? A pianist can, for example, play the same piece of music at a range of different musical tempos. Though temporal scaling is a robust phenomenon, the underlying neurobiological mechanisms are not known, and indeed temporal scaling has not been reported in any of the three classes of biologically plausible models discussed above. A few experimental studies suggest that neural trajectories encode relative time. That is, when animals time intervals of different lengths within the same overall pattern, it appears that the same neural trajectory may be replayed at different speeds [8,71,79].
Given the diverse range of temporal tasks the brain performs, together with the large number of brain areas that have been implicated in timing in the range of tens of milliseconds to a few seconds, we argue that the brain does not have a singular timing mechanism. Rather, the brain has a number of different timing mechanisms, each used to solve specific temporal tasks. In some cases, for example, the brain may use specialized mechanisms for interval timing that are not capable of pattern timing. Potential examples include positive feedback mechanisms and the ramping of neuronal firing rates. But in other instances—for example the discrimination of simple and complex auditory patterns—we propose that the same neural mechanisms can underlie both interval and pattern timing.
The authors are supported by the NIH grants MH60163 and T32 NS058280, and NSF grant IIS-1420897. We thank Martina DeSalvo for comments on this manuscript.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.