The motivation for the DOOD pseudo-wavelet comes from considering how one might construct a mathematical oscillator detector based on analogy to mechanical oscillators. One first constructs a set of mathematical harmonic oscillators each of which has its own natural frequency of oscillation. Next, consider the time series data as a driving force that acts upon each mathematical harmonic oscillator. Start with the mathematical oscillators all at rest, and then let the data drive activity in the mathematical oscillators. If there is an oscillation in the data (a “data oscillator”) that is nearly resonant with one of the mathematical oscillators, then that mathematical oscillator will steadily gain in energy. That mathematical oscillator will continue to gain in energy as long as the data oscillator remains on. If the data oscillator is turned off, the energy of the mathematical oscillator will plateau. If the data oscillator is turned back on again, the mathematical oscillator will either gain or lose energy, depending on whether the data oscillator is in-phase or out-of-phase with the mathematical oscillator. Sudden increases or decreases in energy are both markers of a resonant oscillation in the data. Therefore, we will choose to monitor the time rate of change of the energy of each mathematical oscillator, i.e., its power.

Time resolution can be improved in this scheme by allowing friction to act on the mathematical oscillators. Friction induces the mathematical oscillators to approach a steady state. The steady state will be different depending on whether or not there is resonant driving by data oscillators. The higher the friction, the faster the mathematical oscillator approaches steady state, and the better the time resolution of when data oscillators turn on and off. The cost of better time resolution, of course, is poorer frequency resolution. This counterbalance between time and frequency resolution is a manifestation of the uncertainty principle.

To begin, let *x*_{data}(*t*) represent the voltage recorded by a given electrode at time *t*, and let *ẋ*_{data}(*t*) denote its first time derivative. For ease of notation, we will only consider time series data from one electrode, but generalization to many electrodes is straightforward. Let *h*(*t*) denote either *x*_{data}(*t*) or *ẋ*_{data}(*t*). If we use *h*(*t*) = *x*_{data}(*t*), we will refer to this algorithm as X-DOOD. If we use *h*(*t*) = *ẋ*_{data}(*t*), we will refer to this algorithm as V-DOOD. Here “X” is meant to suggest a coordinate, and “V” suggests a velocity.

Next let *x*(*n*, *t*) represent the coordinate of the *n*^{th} mathematical oscillator, *ẋ*(*n*, *t*) its velocity, *M*(*n*) its mass, *g*(*n*) its friction constant, and *ω*_{0}(*n*) its radial frequency, with *n* = 1 to *N*_{f}, where *N*_{f} is the total number of mathematical oscillators. For convenience, we take all the masses to be equal to one, *M*(*n*) = 1 for all *n*. If we took the mass to be anything other than 1, then the net effect would be to scale *h*(*t*) by the inverse of the mass, which in turn would be equivalent to changing the units of measurement for *h*(*t*). Since we are interested in the relative response of one mathematical oscillator versus another to the same data driving force *h*(*t*), the masses of the mathematical oscillators do not matter as long as they are all the same. The equation of motion for the *n*^{th} mathematical oscillator is then

$$\ddot{x}(n,t) = h(t) - 2g(n)\,\dot{x}(n,t) - \omega_0^2(n)\,x(n,t) \tag{1}$$

The simplest non-transient solution for Eq (1) can be written in terms of a complex wavefunction *ψ*(*n*, *t*):

$$\psi(n,t) = \int_0^t dt'\, h(t')\, e^{-[g(n) - i\omega(n)](t-t')} \tag{2}$$

with $\omega(n) \equiv \sqrt{\omega_0^2(n) - g^2(n)}$. Taking *ψ*_{R}(*n*, *t*) and *ψ*_{I}(*n*, *t*) to be the real and imaginary parts of *ψ*(*n*, *t*), we have

$$x(n,t) = \psi_I(n,t)/\omega(n) \tag{3}$$

$$\dot{x}(n,t) = \psi_R(n,t) - g(n)\,x(n,t) \tag{4}$$

This solution can be verified by taking the first time derivative of Eq (2) and expressing *ψ*(*n*, *t*) in terms of *ψ*_{R}(*n*, *t*) and *ψ*_{I}(*n*, *t*). The real part of this equation should then read: *ψ̇*_{R}(*n*, *t*) = *h*(*t*) − *g*(*n*)*ψ*_{R}(*n*, *t*) − *ω*(*n*)*ψ*_{I}(*n*, *t*). Using Eqs (3) and (4) to eliminate *ψ*_{R}(*n*, *t*) and *ψ*_{I}(*n*, *t*) then yields Eq (1).

The wavefunction *ψ*(*n*, *t*) can be calculated easily using a recursion relationship. To derive this relationship, take one incremental time step *δt*:

$$\psi(n,t+\delta t) = e^{-[g(n)-i\omega(n)]\delta t}\,\psi(n,t) + \int_t^{t+\delta t} dt'\, h(t')\, e^{-[g(n)-i\omega(n)](t+\delta t-t')} \tag{5}$$

For very small time steps *δt*, Eq (5) can be rewritten

$$\psi(n,t+\delta t) \approx e^{-[g(n)-i\omega(n)]\delta t}\,\psi(n,t) + h(t+\delta t)\,\delta t \tag{6}$$

Equation (6) allows us to calculate *ψ*(*n*, *t*+*δt*) rapidly given *h*(*t*+*δt*) and *ψ*(*n*, *t*), without ever having to recalculate *ψ*(*n*, *t*) or to keep track of *h*(*t*) from prior time steps. Note that the exponential factor only has to be calculated once, before taking the first time step. Once we know *ψ*(*n*, *t*+*δt*), we can calculate *x*(*n*, *t*+*δt*) and *ẋ*(*n*, *t*+*δt*) using Eqs (3) and (4).
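As a concrete illustration, the one-step update of Eq (6) can be sketched in a few lines of Python. This is a sketch rather than the authors' reference code: the function and variable names, the sampling rate, and the test frequencies are our own illustrative choices.

```python
import numpy as np

def dood_psi(h, dt, f0, g):
    """Drive one mathematical oscillator of frequency f0 (Hz) and friction g
    with the data series h, using the one-step recursion of Eq (6)."""
    omega = 2.0 * np.pi * f0
    decay = np.exp(-(g - 1j * omega) * dt)    # exponential factor: computed once only
    psi = 0.0 + 0.0j                          # oscillator starts at rest
    out = np.empty(len(h), dtype=complex)
    for k in range(len(h)):
        psi = decay * psi + h[k] * dt         # Eq (6): no history of h is needed
        out[k] = psi
    return out

# A data oscillator resonant with the detector pumps up |psi|;
# an off-resonant detector barely responds.
fs = 1000.0
t = np.arange(0, 2.0, 1.0 / fs)
h = np.sin(2.0 * np.pi * 40.0 * t)            # 40 Hz "data oscillator"
psi_res = dood_psi(h, 1.0 / fs, 40.0, 5.0)    # resonant detector
psi_off = dood_psi(h, 1.0 / fs, 90.0, 5.0)    # off-resonant detector
```

Note that the per-sample work is one complex multiply and add per oscillator, which is what makes the streaming implementation cheap.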

The instantaneous energy of the *n*^{th} mathematical oscillator has terms that involve the friction and terms that are independent of the friction. Since we are interested only in changes in the energy due to coupling to *h*(*t*), we shall drop terms that depend on the friction wherever they arise. The instantaneous energy of the *n*^{th} frictionless harmonic oscillator is given by:

$$E(n,t) = \tfrac{1}{2}\dot{x}^2(n,t) + \tfrac{1}{2}\omega_0^2(n)\,x^2(n,t) \tag{7}$$

The first term is the kinetic energy term, and the second term is the potential energy term. Taking the time derivative of Eq (7) and inserting Eq (1), we find the time rate of change of the instantaneous energy, i.e., the instantaneous power:

$$\dot{E}(n,t) = h(t)\,\dot{x}(n,t) - 2g(n)\,\dot{x}^2(n,t) \tag{8}$$

The term in Eq (8) containing *g*(*n*) is a term due to friction; we shall ignore it. The term in *h*(*t*) is the power due to the data driving force. We take this term to be our measure of spectral density. To emphasize that this measure is that part of the total power due only to the data driving force, we refer to it as the *data power*. The data power is then written as:

$$S(n,t) = h(t)\,\dot{x}(n,t) \tag{9}$$

In comparison, the more traditional wavelet and pseudo-wavelet approaches identify the total energy of the wavefunction *ψ*(*n*, *t*) as the spectral density, which is given as:

$$E(n,t) = \tfrac{1}{2}\left|\psi(n,t)\right|^2 \tag{10}$$

In the absence of friction, the total energy *E*(*n*, *t*) plateaus when there are no active data oscillators. In the presence of friction, the total energy will decay back to baseline when data oscillators are turned off, with a rate given by the friction constant *g*(*n*). One might also imagine looking at the total power, given by *Ė*(*n*, *t*). This quantity drops back to baseline faster than the total energy, but it contains a term that depends only on the friction and not on the data oscillator, i.e., the second term in Eq (8). Therefore, neither *E*(*n*, *t*) nor *Ė*(*n*, *t*) is as sensitive to state changes in the data oscillator as is *S*(*n*, *t*). In the first example below, we will see that *S*(*n*, *t*) can show a sudden drop almost immediately after a data oscillator is turned off, within one period of the resonant frequency. Although *S*(*n*, *t*) is still subject to the uncertainty principle, time resolution is typically much better with *S*(*n*, *t*) than with *E*(*n*, *t*).

In summary, the DOOD algorithm consists of the following steps. First, start with all the mathematical oscillators at rest, *ψ*(*n*, *t*) = 0 at *t* = 0, with the frequency of each mathematical oscillator being *f*(*n*) = *ω*(*n*)/2*π*. Second, use Eq (6) to calculate *ψ*(*n*, *t*) at the next instant in time. Third, calculate *ẋ*(*n*, *t*) using Eq (4). Fourth, calculate the data power spectral density using Eq (9). Fifth, one may not wish to save the data power spectral density at every instant in time; for instance, one may wish to know it only at time intervals given by *t*_{dump}. In this case, one can average *S*(*n*, *t*) over consecutive time intervals, each of duration *t*_{dump}, and then save only these averages for graphing.
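The five steps above can be collected into a single routine. This is a hedged sketch, not the published implementation: the function name and block-averaging details are our own, and it applies Eqs (3)–(4) in the form x = Im ψ/ω and ẋ = Re ψ − g·x, one self-consistent reading of the solution of Eq (1).

```python
import numpy as np

def dood_data_power(h, fs, freqs, g, n_dump):
    """DOOD pipeline: start all oscillators at rest, step psi with Eq (6),
    recover x and x-dot, form the data power S = h * x-dot (Eq (9)), and
    average S over consecutive blocks of n_dump samples (t_dump = n_dump/fs)."""
    dt = 1.0 / fs
    omega = 2.0 * np.pi * np.asarray(freqs, dtype=float)
    g = np.broadcast_to(np.asarray(g, dtype=float), omega.shape)
    decay = np.exp(-(g - 1j * omega) * dt)     # per-oscillator factor, computed once
    psi = np.zeros_like(omega, dtype=complex)  # step 1: all oscillators at rest
    S = np.zeros((len(h), len(omega)))
    for k in range(len(h)):
        psi = decay * psi + h[k] * dt          # step 2: recursion, Eq (6)
        x = psi.imag / omega                   # assumed form of Eq (3)
        xdot = psi.real - g * x                # assumed form of Eq (4)
        S[k] = h[k] * xdot                     # step 4: data power, Eq (9)
    n_blocks = len(h) // n_dump                # step 5: average over t_dump blocks
    return S[: n_blocks * n_dump].reshape(n_blocks, n_dump, -1).mean(axis=1)

# 40 Hz test signal against a small bank of detectors:
fs = 1000.0
t = np.arange(0, 2.0, 1.0 / fs)
h = np.sin(2.0 * np.pi * 40.0 * t)
avg = dood_data_power(h, fs, [10.0, 40.0, 160.0], 10.0, 250)
```

After the initial transient decays, the block-averaged data power is largest, and positive, for the resonant 40 Hz oscillator.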

2.1 Friction and the uncertainty principle

The friction constant *g*(*n*) can be used to sharpen the time resolution of the data power spectral density *S*(*n*, *t*). If a data oscillator turns on, drives up the spectral power to a new level, and then turns off, the friction will bring the data power spectral density back down to baseline on a time scale of 1/*g*(*n*). Larger friction will bring down the spectral power faster, thus giving better time resolution as to when the data oscillator turned off. Larger friction will also cause broadening of the spectral peak, however. This manifestation of the uncertainty principle is expected and unavoidable. It can be demonstrated by inspecting Eq (2). If one inserts a pure sinusoid for *h*(*t*) in Eq (2), one finds a Lorentzian line shape with a peak at the frequency of the sinusoid and a half-width-at-half-maximum given by *g*(*n*). The larger *g*(*n*) is, the wider this Lorentzian peak becomes. That is, the spectral bandwidth of the *n*^{th} mathematical oscillator is *g*(*n*). Larger values of *g*(*n*) give better time resolution but a broader spectral bandwidth for the *n*^{th} mathematical oscillator.

What is the best way to choose the *g*(*n*)’s? There is no single right answer to this question. In choosing the *g*(*n*)’s, we have to decide whether we wish to have better time resolution or better frequency resolution. One way to choose the *g*(*n*)’s is to consider how one distributes the frequencies of the mathematical oscillators, i.e., how one chooses the *f*(*n*)’s. If one chooses the *f*(*n*)’s on a conventional linear grid, with *f*(*n*+1) = *f*(*n*) + *δf*, then it is most reasonable to choose *g*(*n*)/2*π* = *δf*, so that the spectral bandwidths of neighboring mathematical oscillators overlap, leaving no gap between them. Here the absolute frequency resolution is given by *δf*, while the *relative* frequency resolution is given by *g*(*n*)/*ω*(*n*) = *δf*/*f*(*n*). Note that in this case the relative frequency resolution goes like ~1/*n*, i.e., the relative spectral bandwidth of each mathematical oscillator *decreases* for higher frequency oscillators.

One might reasonably argue that, rather than setting the absolute frequency resolution constant for every mathematical oscillator, one should set the relative frequency resolution constant, e.g., by taking [*f*(*n* +1) − *f*(*n*)]/*f*(*n*)= *g*_{0}, where *g*_{0} is a small positive number, the same for every mathematical oscillator. Such a choice then implies that the frequency grid is *geometric*, *f*(*n* +1) = *f*(*n*)[1+ *g*_{0}].

When the frequency range that needs to be spanned for a given problem is small, say, of 1 to 2 orders of magnitude, then it does not matter much whether one chooses a linear or geometric frequency grid. However, when the frequency range that needs to be spanned is 3–4 orders of magnitude, or even larger, then computational economy suggests that one should adopt a geometric frequency grid. In this way, one can span the greatest frequency range with the smallest number of mathematical oscillators, while yet maintaining the same relative spectral resolution of each mathematical oscillator.
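As an illustration, a geometric grid with constant relative bandwidth *g*_{0} can be built as follows. This is a sketch; the function name and the pairing *g*(*n*) = 2*π g*_{0} *f*(*n*) are our choices, following the reasoning above.

```python
import numpy as np

def geometric_grid(f_min, f_max, g0):
    """Frequencies f(n+1) = f(n) * (1 + g0) from f_min up to at least f_max,
    with friction constants g(n) = 2*pi*g0*f(n) so that every oscillator
    has the same relative spectral bandwidth g0."""
    n = int(np.ceil(np.log(f_max / f_min) / np.log1p(g0))) + 1
    f = f_min * (1.0 + g0) ** np.arange(n)
    return f, 2.0 * np.pi * g0 * f

# Spanning three orders of magnitude at 5% relative resolution takes
# only ~140 oscillators; a linear grid at 1 Hz spacing would need ~1000.
f, g = geometric_grid(1.0, 1000.0, 0.05)
```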

2.2 Computational cost

Here we consider computational cost and compare with the cost of the traditional time-windowed fast Fourier transform (FFT). Let *N*_{tot} be the total number of time steps in the entire time series, and let *N*_{win} be the number of time steps in a given FFT time window. The number of FFT time windows is then *N*_{S} = *N*_{tot}/*N*_{win}. In FFT, *N*_{win} and *N*_{f} are constrained to be related: *N*_{win} = 2*N*_{f}. If one constructs *N*_{S} spectra at equal non-overlapping time intervals, then the computational cost of FFT goes like:

$$C(FFT) \sim N_S\,N_{win}\log(N_{win}) = N_{tot}\log(N_{win}) \tag{11}$$

The DOOD algorithm Eq (6) yields a power spectrum at every single time step. Its cost therefore goes like

$$C(DOOD) \sim N_{tot}\,N_f \tag{12}$$

Comparing Eqs (11) and (12), one sees that FFT is much faster than DOOD for large *N*_{f}. Of course, it is not entirely fair to compare *C*(*FFT*) and *C*(*DOOD*) in Eqs (11) and (12) directly, because the DOOD algorithm yields a spectrum at every single time step, while the FFT algorithm only yields one spectrum for every *N*_{win} time steps. A fairer comparison might be to use overlapping FFT time windows, such that each time window advances by only one time point. In this case, the FFT computational cost goes like:

$$C(FFT) \sim N_{tot}\,N_{win}\log(N_{win}) \tag{11'}$$

Comparing Eq (12) to Eq (11′), one sees that the DOOD algorithm is computationally *less expensive* than FFT when time resolution at the level of the sampling rate is desired.

Another situation where the DOOD algorithm will be faster than FFT is where one wishes to monitor a single well defined frequency *f*_{data} with spectral width *δf*_{data} such that *δf*_{data}/*f*_{data} ≪ 1. In this case, one can choose a single mathematical oscillator with frequency *f*(1) = *f*_{data} and friction *g*(1)/2*π* = *δf*_{data}. The computational cost of the DOOD algorithm then goes like *N*_{tot}. In comparison, using FFT, if one wishes a spectral resolution of *δf*_{data}/*f*_{data} ≪ 1, then one needs to monitor at least *N*_{f} points in the frequency domain, where *N*_{f} ~ *f*_{data}/*δf*_{data} ≫ 1. In this case, the computational cost of FFT goes like *N*_{tot}*N*_{f} log(*N*_{f}), which is much greater than that for the DOOD algorithm.
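The scalings in Eqs (11), (11′) and (12) are easy to tabulate. The operation counts below drop all constant prefactors, as in the text, and the example sizes are arbitrary:

```python
import numpy as np

def c_fft_nonoverlap(n_tot, n_f):
    """Eq (11): one FFT per non-overlapping window of N_win = 2*N_f samples."""
    n_win = 2 * n_f
    return (n_tot / n_win) * n_win * np.log2(n_win)

def c_fft_overlap(n_tot, n_f):
    """Eq (11'): one FFT per time step (windows advancing by one sample)."""
    n_win = 2 * n_f
    return n_tot * n_win * np.log2(n_win)

def c_dood(n_tot, n_f):
    """Eq (12): N_f oscillator updates per time step."""
    return float(n_tot * n_f)

n_tot, n_f = 10**6, 512
# Non-overlapping FFT is far cheaper than DOOD, but once a spectrum is
# wanted at every time step, DOOD undercuts the overlapped FFT.
```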

It is also possible to implement the DOOD algorithm using Eq (2) directly. In this approach, one segments the total data time series into time windows, as one would for conventional FFT. Within each time window, one can calculate Eq (2) rapidly by taking the FFT of both sides of Eq (2), evaluating Eq (2) in Fourier space, then back transforming to recover Eq (2) in the time domain. The computational cost in this case is of the same order as that of straight FFT. In practice, we find DOOD fast enough using Eq (6) that we do not resort to this hybrid DOOD-FFT algorithm. It is conceivable that for situations where time resolution is not a high priority, one may adopt a hybrid DOOD-FFT approach.

2.3 Nyquist theorem

The sampling rate *f*_{S} of the raw data places an upper limit on the highest frequency oscillator that can be calculated reliably, because if there is an oscillation in the data that is of the same frequency or higher, then it will be artifactually “folded” into the dynamics of oscillators of somewhat lower frequency. This well-known phenomenon applies equally to FFT and to the DOOD pseudo-wavelet. It is advisable to take mathematical oscillators of frequency no higher than *f*_{S}/2 for this reason.
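The folding can be seen directly in the samples. In this sketch (the sampling rate and frequencies are arbitrary), a 940 Hz oscillation sampled at 1000 Hz produces exactly the same sample values, up to sign, as a 60 Hz oscillation, so it would drive a 60 Hz mathematical oscillator just as effectively:

```python
import numpy as np

fs = 1000.0                                        # sampling rate, Hz
k = np.arange(256)                                 # sample indices
f0 = 60.0
s_low = np.sin(2.0 * np.pi * f0 * k / fs)          # 60 Hz: below fs/2
s_high = np.sin(2.0 * np.pi * (fs - f0) * k / fs)  # 940 Hz: above fs/2
# sin(2*pi*(fs-f0)*k/fs) = sin(2*pi*k - 2*pi*f0*k/fs) = -sin(2*pi*f0*k/fs),
# so the two sampled series are indistinguishable apart from a sign flip.
```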

2.4 Comparison with the wavelet transform and derivation of formal inverse

A wavelet transform has the form:

$$\psi_w(n,t) = \int_{-\infty}^{\infty} dt'\, h(t')\, \varphi_w^{*}(n,\,t-t') \tag{14}$$

Here *φ*_{w}(*n*, *t*) is the wavelet kernel. It has a peak amplitude at frequency *f*(*n*) with a certain scaled width. The asterisk denotes a complex conjugate. A defining feature of wavelet transforms is that each wavelet kernel *φ*_{w}(*n*, *t*) must satisfy the admissibility criterion, which states:

$$\int_{-\infty}^{\infty} d\omega\, \frac{\left|\hat{\varphi}_w(n,\omega)\right|^2}{|\omega|} < \infty$$

Here *φ̂*_{w}(*n*, *ω*) is the Fourier transform of *φ*_{w}(*n*, *t*). A consequence of the admissibility criterion is that the average of the wavelet kernel over all time must equal zero. A wavelet kernel that satisfies the admissibility criterion results in a wavelet transform that can be inverted to recover *h*(*t*) from *ψ*_{w}(*n*, *t*) through use of a general inversion formula that guarantees inversion without loss of information (Farge 1992; Cohen and Kovacevic 1996). This property is very useful in applications for information transmission: one can code information in terms of wavelets, compress the code by throwing out those wavelets that are of very low amplitude, transmit the remaining significant wavelets, and then reconstruct the original signal by use of the inversion formula.

However, violation of the admissibility criterion does not necessarily preclude an inverse; it merely means that the general inversion formula discovered for admissible wavelets is not applicable. We will come back to this point shortly.

Comparing Eq (14) with Eq (2), one sees that the DOOD transform is reminiscent of a wavelet transform, if one identifies the pseudo-wavelet kernel as

$$\varphi_{pw}(n,t) = \begin{cases} e^{-[g(n)+i\omega(n)]t}, & t \geq 0 \\ 0, & t < 0 \end{cases}$$

Because *φ*_{pw}(*n*, *t*) = 0 for *t* < 0, this pseudo-wavelet kernel is said to be *causal*: the DOOD transform *ψ*(*n*, *t*) depends only on the current and past values of the signal *h*(*t*), and not on any future values of *h*(*t*). This property is desirable for time-frequency transforms. However, the time average of *φ*_{pw}(*n*, *t*) does not equal zero, and so *φ*_{pw}(*n*, *t*) is *not* an admissible wavelet kernel. This means that the general inversion formula derived for wavelets cannot be used for the DOOD transform.

Nonetheless, for every DOOD pseudo-wavelet *n*, an inverse does exist. For formal completeness, we show how to calculate this inverse. We shall first find *h*(*n*, *t*), which represents the reconstruction of the total signal *h*(*t*) using only the *n*^{th} damped oscillator. The best reconstruction of the total signal will then be *h*(*n*, *t*) averaged over all DOOD pseudo-wavelets.

To proceed, define the one-sided Fourier transform of *ψ*(*n*, *t*) as usual:

$$\hat{\psi}(n,s) = \int_0^{\infty} dt\, \psi(n,t)\, e^{-ist}$$

Taking the one-sided Fourier transform of both sides of Eq (2) then yields:

$$\hat{h}(n,s) = \left[\,g(n) + i\,(s - \omega(n))\,\right]\hat{\psi}(n,s)$$

where *ĥ*(*n*, *s*) ≡ *ĥ*_{R}(*n*, *s*) + *iĥ*_{I}(*n*, *s*). One next takes the inverse Fourier transform of *ĥ*(*n*, *s*) to obtain *h*(*n*, *t*). If the DOOD kernels *φ*_{pw}(*n*, *t*) do not overlap too much, i.e., if *g*(*n*) is small compared to the difference between neighboring oscillator frequencies, then the total original signal may be approximated by:

$$h(t) \approx \frac{1}{N_f}\sum_{n=1}^{N_f} h(n,t)$$

2.5 Definition of instantaneous phase

For every pair of values for *x*(*n*, *t*) and *ẋ*(*n*, *t*), calculated via Eqs (3)–(4), we can define an instantaneous phase, *θ*(*n*, *t*) = *ω*(*n*)*t* + *δ*(*n*), via:

$$\tan\theta(n,t) = \frac{\omega(n)\,x(n,t)}{\dot{x}(n,t)}$$

2.6 Definition of pair correlation functions

Since the spectral density of interest is the data power, we construct a variety of correlation functions from the data power spectral density. The pair correlation function is defined:

If we wish to extract phase information, we have to be careful not to introduce phase shifts through the sign of the data power. This kind of artifact is most easily prevented by squaring the data power before taking the time average:

Equations (24)–(27) can be used to check whether the phase of oscillator *n* modifies the data power amplitude of oscillator *m*. If oscillator *m* tends to be more activated at a particular phase of oscillator *n*, then *σ*_{0}(*m*, *n*) will be relatively larger for that pair (*m*, *n*), and *θ*_{0}(*m*, *n*) will give the relative phase of *n* during which the square magnitude of the data power of *m* is largest.

2.7 Advantages of X-DOOD vs V-DOOD

Recall that *x*_{data}(*t*) represents the voltage recorded by a given electrode at time *t*, and *ẋ*_{data}(*t*) denotes its first time derivative. The time series in *x*_{data}(*t*) will be more sensitive to low frequency activity, while that in *ẋ*_{data}(*t*) is more sensitive to higher frequency activity. The reason is that the instantaneous time derivative of a sinusoid is equal to the frequency of oscillation of the sinusoid times the amplitude of the sinusoid. For example, if *x*_{data}(*t*) = *A* sin(*ωt*), then *ẋ*_{data}(*t*) = *Aω* cos(*ωt*). The extra factor of *ω* in *ẋ*_{data}(*t*) magnifies the effect of high frequency oscillations relative to that of low frequency oscillations. Conversely, for very small *ω*, the extra factor of *ω* reduces the effect of low frequency oscillations relative to that of high frequency oscillations. Thus we expect V-DOOD to be more sensitive to high frequency oscillations, while X-DOOD should be more sensitive to low frequency oscillations.

In general, as we shall see in the examples, one should perform both X-DOOD and V-DOOD when performing time-frequency analysis, in order to explore both low frequency and high frequency structure. However, because V-DOOD requires taking a numerical derivative of the data, there is the potential problem that this numerical derivative may magnify the effects of noise. We shall see that, in practice, the noise magnification introduced by V-DOOD is quite tolerable. In fact, in our examples, V-DOOD does a better job of detecting low frequency oscillations than X-DOOD does of detecting high frequency oscillations, and so V-DOOD may be preferable to X-DOOD as an initial screening tool.

Finally, on the subject of taking time derivatives and magnifying noise, it is noteworthy that the velocity of the *n*^{th} mathematical oscillator, *ẋ*(*n*, *t*), is calculated directly from *ψ*(*n*, *t*) using Eq (4), without having to take a numerical derivative. Since numerical derivatives tend to magnify noise, it is advantageous whenever possible to avoid having to take them.