Search tips
Search criteria 


Logo of nihpaAbout Author manuscriptsSubmit a manuscriptHHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal;
Hum Brain Mapp. Author manuscript; available in PMC 2007 May 10.
Published in final edited form as:
PMCID: PMC1866291

Detection and Quantification of a Wide Range of fMRI Temporal Responses Using a Physiologically-Motivated Basis Set


The temporal dynamics of fMRI responses can span a broad range, indicating a rich underlying physiology, but also posing a significant challenge for detection. For instance, in human auditory cortex, prolonged sound stimuli (~30 sec) can evoke responses ranging from sustained to highly phasic (i.e., characterized by prominent peaks just after sound onset and offset). In the present study, we developed a method capable of detecting a wide variety of responses, while simultaneously extracting information about individual response components, which may have different neurophysiological underpinnings. Specifically, we implemented the general linear model using a novel set of basis functions chosen to reflect temporal features of cortical fMRI responses. This physiologically-motivated basis set (the “OSORU” basis set) was tested against (1) the commonly employed “sustained-only” basis “set” (i.e., a single smoothed “boxcar” function), and (2) a sinusoidal basis set, which is capable of detecting a broad range of responses, but lacks a direct relationship to individual response components. On data that included many different temporal responses, the OSORU basis set performed far better overall than the sustained-only set, and as well or better than the sinusoidal basis set. The OSORU basis set also proved effective in exploring brain physiology. As an example, we demonstrate that the OSORU basis functions can be used to spatially map the relative amount of transient vs. sustained activity within auditory cortex. The OSORU basis set provides a powerful means for response detection and quantification that should be broadly applicable to any brain system and to both human and non-human species.

Keywords: fMRI, general linear model, temporal dynamics, auditory cortex


It is well recognized that sites of brain activation could escape detection with functional magnetic resonance imaging (fMRI) if the temporal dynamics of activation and the a priori temporal assumptions of the detection technique are poorly matched. Numerous approaches have been proposed attempting to minimize the assumptions concerning activation dynamics in order to avoid the possibility of missed activation [e.g., Andersen et al., 1999; Brammer, 1998; Clare et al., 1999; Golay et al., 1998]. However, for the most part, these techniques have remained more of theoretical interest than practical significance because there have not been striking examples of their necessity.

The wide range of temporal fMRI responses recently found in human auditory cortex to prolonged (e.g., 30 sec) stimuli provide a clear illustration of the need for methods capable of detecting an extensive range of response waveshapes in “epoch” (i.e., “block”) related fMRI paradigms [Giraud et al., 2000; Harms and Melcher, 2002; Seifritz et al., 2002]. Depending on the time-envelope of a sound stimulus, the temporal dynamics of auditory cortical activation can vary from the sustained waveshapes seen typically in fMRI to atypical “phasic” waveshapes that include prominent peaks just after sound onset and offset (see, e.g., Fig. 2) [see also Harms, 2002; Harms and Melcher, 2002; Harms et al., 2001]. For instance, 30-sec trains of repeated noise bursts with a low burst repetition rate (e.g., 2/sec) produce highly sustained responses whereas trains with a high rate (e.g., 35/sec) elicit phasic responses in the same cortical areas. The demonstrated capacity of auditory cortex to show these variations in fMRI waveshape raises the possibility of similarly dramatic, but as yet unidentified variations in other cortical areas. It also raises the possibility of additional, as yet undetected modes of response both within and outside the auditory system.

Figure 2
Top three rows: Activation maps obtained using the OSORU, sinusoidal, and sustained-only basis sets, for two different stimuli that elicit sustained (left) or phasic (right) responses. The OSORU and sinusoidal basis sets perform well, regardless of underlying ...

That some forms of activation can be easily missed is clearly illustrated using one of the most statistically powerful, but also the most constrained of detection methods: cross-correlation with an assumed response waveshape. For a commonly used cross-correlating function, a smoothed boxcar, the cross-correlation approach performs well when activation is sustained, but poorly when it is not. This point is well illustrated by the poor detection of phasic responses in auditory cortex, as compared to the good detection of sustained responses (e.g., Fig. 2, bottom activation maps). In cases like auditory cortex, where the extremes of response waveshape are quite different, a detection technique with highly constrained assumptions concerning waveshape is bound to fail at one end of the waveshape spectrum or the other. A method that allows for a wide range of temporal dynamics is, therefore, essential if activation is to be detected reliably. The overall objective of the present study was to develop such a method. We specifically sought an approach that would provide information about the underlying waveshape of activation as an automatic by-product of detection.

Of the various approaches with the potential to detect a range of response waveshapes, we identified the general linear model (GLM) as having, in principle, the characteristics needed to meet our goals, and rejected several other alternatives because they did not satisfy our requirements. In theory, a signal decomposition based on wavelet analysis may support the detection of responses with dynamics that vary over different temporal scales [Brammer, 1998; von Tscharner and Thulborn, 2001]. However, a wavelet transformation does not necessarily facilitate the extraction of directly pertinent, interpretable information concerning the temporal dynamics of activation. Other possible alternatives, such as autocorrelation analysis [Paradis et al., 1996], fuzzy clustering [Baumgartner et al., 1998; Chuang et al., 1999; Fadili et al., 2000; Golay et al., 1998], and principle component analysis [Andersen et al., 1999; Sychra et al., 1994], lack a well-defined statistic that supports inference on a univariate, voxel-by-voxel, basis. The GLM, on the other hand, does not suffer from this limitation and can provide direct information concerning the temporal properties of activation for individual voxels.

The basic idea behind the GLM is that the response can be modeled as a weighted sum of “basis functions.” The amplitudes of the basis functions are estimated so as to give the best overall fit to the measured response. As a by-product of detection under the GLM, the basis functions and their corresponding amplitudes can provide direct information about the underlying temporal responses, provided the basis functions are chosen to relate directly to specific features of the waveshape of activation. Additional strengths of the GLM include: (1) the capability to handle the correlated nature of fMRI time-series, (2) the existence of a well-defined, easily-computed statistic (F-statistic) for estimating significance relative to the null-hypothesis, and (3) good statistical power characteristics [Ardekani and Kanno, 1998].

An important element of the present study was devising a flexible, yet concise, set of basis functions for modeling a range of cortical responses within the GLM framework. The vast majority of previous implementations of the GLM have used a single, “sustained” basis function (equivalent to cross-correlation with a smoothed boxcar or, similarly, a t-test). Some studies have used a two-element basis set, supplementing the standard sustained function with its temporal derivative or a component with an exponential decay [Friston et al., 2000; Giraud et al., 2000]. However, this implementation remains limited in terms of the range of response waveshapes that can be handled. The opposite extreme is to use basis functions that lead to a direct estimate of all time-points of the response (i.e., finite impulse response models), thereby allowing for complete flexibility in possible response dynamics (and truly unbiased response estimation) [Burock and Dale, 2000; Miezin et al., 2000]. However, such an approach will typically be ill advised for epoch-related paradigms with prolonged stimulus presentation, since the direct estimation of many time-points will result in a statistical test with drastically reduced power relative to a test based on a small, well-chosen basis set.

An example of a small, well-chosen basis set is a series of sinusoids (i.e., a truncated Fourier series) [Ardekani et al., 1999; Bullmore et al., 1996; Friston et al., 1995b]. The GLM using this basis set is generally acknowledged to be powerful in terms of detection, and flexible in that it can handle a wide range of temporal responses. In theory, this approach is capable of detecting any response with frequency components in the range of the basis set. A downside, however, is that sinusoidal basis functions do not necessarily have physiological meaning. Thus, while providing a powerful means for detecting a variety of responses, a sinusoidal basis set does not meet our objective of providing direct information about different response components.

The present study took a different approach from previous GLM work in that the choice of basis functions was neurophysiologically motivated. Specifically, the form of each function was chosen to mirror the general shape of a particular component in actual fMRI responses to prolonged (e.g., 30 sec) stimuli, the idea being that certain components may indicate particular aspects of underlying neural activity (e.g., the prominent peaks after stimulus onset and offset in the phasic responses of auditory cortex likely reflect neural activity in response to stimulus onset and offset) [Harms and Melcher, 2002]. Thus, the basis functions, together with their amplitudes, should provide direct, readily interpretable information about the temporal dynamics of responses, and hence the neural activity underlying them. A key difference between the present approach and some previous physiologically-driven detection methods [Purdon et al., 2001] is that our choice of basis functions was oriented toward understanding the neural activity behind fMRI responses, rather than modeling the hemodynamics. Here, five basis functions were chosen. These will be referred to as the “OSORU” basis set, a name that derives from descriptions of the individual components (Onset, Sustained, Offset, Ramp, Undershoot; see Fig. 1).

Figure 1
The five physiologically-motivated functions of the OSORU basis set. The shaded area indicates the period of sound stimulation.

The present study examined the utility of the OSORU basis set in three ways. First, the detection capability of the OSORU basis set in epoch-related paradigms was assessed by testing whether the extent of detected activation was (1) greater than or equal to the extent obtained using one of the most common detection methods, i.e., comparison with a (smoothed) boxcar reference waveform, and (2) comparable to that obtained using a sinusoidal basis set, an alternative basis set generally acknowledged to be powerful, yet flexible, in terms of response detection. Second, the ability of the OSORU basis set to quantify different waveshape components was assessed by comparing the amplitude of different OSORU basis functions (or combinations thereof) with direct measurements (e.g., baseline to peak amplitudes) from the waveforms themselves. Both the detection and quantification tests were conducted using an extensive and challenging database composed of auditory cortical responses to a variety of sounds. Finally, we examined the utility of the OSORU basis set for extracting physiological information concerning responding brain areas. As an example, we derived one particular measure from the OSORU basis functions that summarizes the degree to which auditory cortex responds to sound in a transient vs. sustained manner. We show that the relative amounts of transient and sustained activity can be captured in this measure, and can be spatially mapped across cortical areas.

Overall, the GLM method using the OSORU basis set provided reliable detection and straightforward characterization of the temporal dynamics of auditory cortical activation. Although developed and tested on the human auditory system, this approach should also be applicable, both in concept and detail, to other brain systems and species.


fMRI Data

The data used in this report are pooled across experiments that examined the representation of sound in fMRI response waveshape in the human auditory cortex [Harms, 2002; Harms and Melcher, 2002; Harms et al., 2001].1 The overall database comes from 25 subjects and 39 total imaging sessions. All studies were approved by the institutional committees on the use of human subjects at the Massachusetts Institute of Technology, Massachusetts Eye and Ear Infirmary, and Massachusetts General Hospital, and all subjects gave their written informed consent.

Stimuli were trains of broadband noise bursts with various repetition rates (1, 2, 10, and 35/sec) and duty-cycles (5, 25, 50, and 88%), trains of narrowband noise bursts (third octave bandwidth, center frequency 500 Hz or 4 kHz; rate = 2 or 35/sec), trains of tone bursts (500 Hz or 4 kHz, rate = 2 or 35/sec), continuous broadband noise, trains of clicks (35 or 100/sec), orchestral music, and running speech. This repertoire of stimuli produced a wide range of response waveshapes and thus provided a challenging database for developing and testing the detection and quantification approach of the present study. Three to six stimuli were included in each imaging session, giving a database with a total of 177 cases for which we constructed activation maps and response estimates.

Stimuli were always presented for 30 sec “on” epochs alternated with 30 sec “off” epochs, during which no auditory stimulus was presented. The on/off “period” corresponding to a given stimulus was repeated 4-13 times during an imaging session. The actual functional imaging was organized into individual “runs” consisting of 4-5 on/off periods. Typically, the different stimuli in a session were presented once each run, and their order was varied across runs. However, in 13 sessions the same stimulus was used repeatedly throughout a given run, as was also the case for all cases of the music stimulus. Stimuli were delivered binaurally through a headphone assembly that provided approximately 30 dB of attenuation at the primary frequency of the scanner noise [Ravicz and Melcher, 2001]. In 26 of the sessions, stimulus levels were set to approximately 55 dB above threshold (SL); in the remaining 13 sessions stimulus level was varied between 35 and 75 dB SL.

Imaging was performed on five different systems: 1.5 and 3.0 T General Electric magnets retrofitted for high-speed, echo-planar imaging (by Advanced NMR Systems, Inc.), a 1.5 T General Electric Signa Horizon magnet, and 1.5 and 3.0 T Siemens System Sonata and Allegra magnets. For functional imaging, the selected slice intersected the inferior colliculus and the posterior aspect of Heschl’s gyrus and the superior temporal gyrus (which include auditory cortical areas). A single slice, rather than multiple slices, was imaged to reduce the impact of scanner-generated acoustic noise on auditory activation without sacrificing temporal resolution. Slice thickness was always 7 mm with an in-plane resolution of 3.1 × 3.1 mm. Functional images of the selected slice were acquired using a blood oxygenation level dependent (BOLD) sequence. For the 1.5 T experiments, the sequence parameters were: asymmetric spin echo, TE = 70 msec, τ offset = –25 msec, flip = 90 degrees. For the 3.0 T experiments, the parameters were: gradient echo, TE = 30 msec (except one session used 40 msec and another used 50 msec), flip = 60 degrees or 90 degrees. A T1-weighted anatomical image (in-plane resolution = 1.6 × 1.6 mm, thickness = 7 mm) of the functionally imaged slice was also obtained and used to localize auditory cortex.

The present study focuses on responses from auditory cortex, although the data comes from studies that also examined the inferior colliculus. Therefore, functional images were generally collected using a cardiac gating method that increases the detectability of activation in the inferior colliculus [Guimaraes et al., 1998]. Image acquisitions were synchronized to every other QRS complex in the subject’s electrocardiogram, resulting in an average interimage interval (TR) of approximately 2 sec. Image signal was corrected to account for the image-to-image variations in signal strength (i.e., T1 effects) that result due to fluctuations in heart rate [Guimaraes et al., 1998]. In the 3 sessions that did not use cardiac gating, the TR was 2 sec.

Prior to response estimation using the general linear model, two additional imaging pre-processing steps were performed. First, the images for each scanning run were corrected for any in-plane movements of the head that may have occurred over the course of the imaging session using standard software (SPM95 software package; without spin history correction) [Friston et al., 1995a, 1996]. Then, because cardiac gating results in irregular temporal sampling, the time-series for each imaging run and voxel was linearly interpolated to a consistent 2-sec interval between images, using recorded interimage intervals to reconstruct where each image occurred in time. While it would be relatively straightforward to create basis functions sampled at the actual times at which images were acquired, there is no straightforward way to rigorously model and estimate the noise covariance given irregular temporal sampling. Therefore, we interpolate the time-series for each voxel and treat the data as if it were originally sampled with a regular (2 sec) interval.

Independent of the GLM implementation, we computed empirical response time courses by averaging across repeated presentations of a given stimulus in a given imaging session. Specifically, the time-series for each imaging run and voxel was corrected for linear or quadratic drifts in signal, normalized to an (arbitrary) mean intensity, and temporally smoothed using a three point, zero-phase filter (with coefficients 0.25, 0.5, 0.25). A response “block” was defined as a 70-sec window (35 images) that included 10 sec prior to stimulus onset, the 30 sec coinciding with the stimulus “on” period, and the 30 sec “off” period following the stimulus. These response blocks were averaged according to stimulus to give an average signal vs. time waveform for a given stimulus in a session. For each stimulus and session, we further averaged signal vs. time across “active” voxels (i.e., all voxels in auditory cortex with P < 0.001 according to the OSORU GLM). The resulting “grand-average” waveforms were then converted to percent change in signal relative to baseline. The baseline was defined as the average signal from t = –6 to 0 sec, with time t = 0 sec corresponding to the onset of the stimulus.

Basis Functions

OSORU basis set

The OSORU basis set was composed of five physiologically-motivated components, shown in Figure 1. The design of these components was guided by responses observed in our previous study of the representation of sound repetition rate in the human auditory pathway [Harms and Melcher, 2002]. These responses were used to establish appropriate latencies for the basis functions, and to model the shape of the onset basis function (which was used subsequently to derive the sustained, ramp, and offset basis functions; see below). Importantly, the responses used to guide the basis function formulation represent only a small subset of the total responses in the present database (i.e., ~20 of 177 responses). Therefore, the vast majority of responses serve as an independent test of the OSORU basis set.

The OSORU basis functions include: (1) A transient response to the onset of the stimulus. The time course of this component peaks at 6 sec post-stimulus onset and returns to near baseline by 14-16 sec; (2) A sustained response during the stimulus. This component represents the convolution (normalized to unit amplitude) of the onset component with a boxcar waveform.2 The beginning of the sustained component was designed to overlap the onset component, so that the onset component would primarily reflect a transient response that was above and beyond any sustained response; (3) A slowly changing (approximately linear) response during the stimulus, included because responses in auditory cortex sometimes exhibit a “signal recovery” following the signal decline that characterizes the onset component. This “ramp” component represents the convolution of the onset component with a ramp that increases linearly from t = 8 to 30 sec; (4) A transient “offset” response to stimulus termination, which is a time-shifted version of the onset component. Offset responses occur frequently in our auditory database. Yet, the possibility of offset responses has generally not been factored into fMRI response detection; (5) An additional transient component (with a simple Gaussian waveshape) following the offset component was included primarily to help model responses with a post-stimulus undershoot. Like the other basis functions, this “undershoot” function was defined as a positive deviation from baseline (Fig. 1). It was, therefore, expected to have a negative amplitude for responses with a clear undershoot (i.e., a negative deviation from baseline). Altogether, these five components define the OSORU basis set (Onset, Sustained, Offset, Ramp, Undershoot). The five components of the OSORU set form a concise, yet flexible, basis set that is capable of naturally modeling the prominent features of a variety of response waveshapes.

Sustained-only basis “set”

For one comparison to the OSORU basis set, we used a basis “set” that consisted of just a single component, namely, the sustained component of the OSORU basis set. We included this comparison since methods assuming a sustained response (whether a raw “boxcar” waveform, e.g., a t-test, or a “hemodynamically smoothed” version thereof) have historically played a prominent role in the analysis of fMRI data acquired in epoch-related paradigms. Note that computing activation maps from the sustained-only basis set is not the same as computing maps from just the magnitude of the sustained component in the OSORU basis set. This is because the basis functions in the OSORU set are correlated (correlation coefficients: —0.5 to 0.6). Consequently, the estimated amplitudes for a given component are dependent on precisely which other components are included in the complete model [Andrade et al., 1999; Draper and Smith, 1981].

Sinusoidal basis set

For the second comparison to the OSORU basis set, we examined a sinusoidal basis set that consisted of the 1st through 4th harmonics of a truncated Fourier series (including both sine and cosine terms). The 1st harmonics had a fundamental frequency defined by the “on/off” stimulation period (i.e., 1/60 sec). Previous studies have used just the 1st-3rd harmonics, under the assumption that these harmonics were sufficient for modeling the response space [Ardekani et al., 1999; Bullmore et al., 1996]. Here, we added an additional higher-order harmonic because preliminary work indicated it was needed to ensure optimal detection of phasic responses. In particular, we simulated an artificial data set consisting of a phasic response in Gaussian white noise, and examined the performance of sinusoidal basis sets in which the maximum harmonic varied from one to eight (including both the sine and cosine terms for all harmonics up to and including the maximum). Receiver-operator-characteristic curves [Constable et al., 1995; Skudlarski et al., 1999] showed that the basis set with the 1st-4th harmonics was most powerful in detecting the phasic response. The important contribution of the 4th harmonics is not surprising given that these harmonics together account for about 25% of the variance in a signal with a phasic waveshape, second only to the 2nd harmonics, which together account for 45% of the signal variance. (Because sinusoids are orthogonal, the “explained signal variance” can be uniquely partitioned among the individual components).

Covariates of no interest

In implementing the GLM, for each of the three basis sets (i.e., OSORU, sustained-only, sinusoidal), we included three additional functions as part of the overall basis set for each imaging run. These included a constant term for modeling the baseline signal level of a run, and linear and quadratic functions for modeling any low-frequency signal drift that occurred over the duration of a run. These functions are unrelated to the stimulus-induced response and were, therefore, excluded in the creation of the activation maps (described below). Note that we assumed response waveshape and magnitude were constant across repeated presentations of a given stimulus, so none of the implementations of the GLM incorporated a progressive “habituation” in the amplitude of the basis functions across stimulus repetitions.

Response and Noise Estimation Under the General Linear Model

The GLM was implemented separately for each of the three basis sets: OSORU, sustained-only, and sinusoidal.3 It is well documented that the noise in typical fMRI time-series is autocorrelated (i.e., is not white) [Bullmore et al., 1996; Locascio et al., 1997; Purdon and Weisskoff, 1998; Zarahn et al., 1997], typically resulting in an inflation of the false-positive rate above its theoretically predicted value. Therefore, in implementing the GLM, an estimate of the noise autocorrelation was used to pre-whiten the data.4

Under the general linear model [e.g., Burock and Dale, 2000; Friston et al., 1995c], the time-series y for a single voxel is modeled as a weighted linear sum of basis functions plus a noise term: y=Xβ+η, where the columns of the design matrix X contain the basis functions used to model the response space, β is a vector of basis function amplitudes to be estimated, and η is a noise sequence with arbitrary covariance matrix Ω. If the covariance Ω is known, the generalized least squares (GLS) estimator (which is also the maximum likelihood estimator if the noise is multivariate Gaussian) of β is β^GLS=(XTΩ1X)1XTΩ1y. This estimator is the optimally efficient estimator, meaning that it has the smallest possible variance, among the class of linear, unbiased estimators of β. In our situation, the noise covariance Ω is unknown, but can be estimated (described below). Given such an estimate Ω^, the feasible generalized least squares (FGLS) estimator is obtained by simply replacing Ω with Ω^, i.e., β^FGLS=(XTΩ^1X)1XTΩ^1y. Under fairly general conditions, β^FGLS will have desirable asymptotic properties. Generally, if Ω^ is a consistent estimator of the noise, then β^FGLS will be an asymptotically efficient estimator of β [Fomby et al., 1984]. Depending on the mismatch between the estimated and actual autocorrelations, β^FGLS will have nearly minimum variance.

To estimate the noise covariance, we adopted a two-stage approach [Bullmore et al., 1996; Burock and Dale, 2000; Harms, 2002; Marchini and Smith, 2003; Worsley et al., 2002] in which we first computed the ordinary least squares (OLS) estimate of β. The underlying noise structure was then estimated directly from the empirically determined autocorrelation function [e.g., Wicker and Fonlupt, 2003] of the OLS residual errors (see Appendix). The noise was assumed to be locally constant across auditory cortex to within a scale factor, i.e., we assumed Ω^=σ^2Λ^, where Λ^ is the “normalized” covariance matrix (with ones on the diagonal) that is constant across voxels, and σ^2 is the scalar variance that is estimated separately for each voxel (see Appendix). Since the noise covariance can vary across the brain [Bullmore et al., 1996; Friston et al., 2000], we limited the noise estimation to voxels from auditory cortex, in order to avoid using voxels that were altogether unrelated to the structure of interest. Auditory cortex was defined (from anatomical images) as all cortex lateral to the medial-most aspect of the Sylvian fissure, between the superior temporal sulcus and the inferior edge of the parietal lobe.

Examination of Residuals

To investigate the validity of our approach for estimating the structure of the noise covariance, we examined the residuals from the GLM. In particular, we computed the Ljung-Box diagnostic [Ljung and Box, 1978] for both the “whitened” and OLS residuals (see Appendix). The Ljung-Box diagnostic tests whether the residuals are consistent with white (i.e., uncorrelated) noise, and is given by


where rk is the autocorrelation function computed from the residuals and N is the length of the residual vector. Under the null hypothesis of serially independent (uncorrelated) residuals, QK is distributed as X2 with K degrees of freedom. We choose K = 10, and computed Q10 on a voxel-by-voxel, and run-by-run basis. To summarize the results, we computed, for each run, the percentage of voxels in auditory cortex that failed the Ljung-Box test at a P value of 0.01, and then averaged these percentages across runs to form a single summary measure for each experimental session (PercentLB). Note that the expected value of PercentLB would be 1% if the residuals of voxels in auditory cortex were indeed uncorrelated. Larger percentages indicate that the residuals were more frequently inconsistent with the null hypothesis of white noise, indicating that the nominal p-values computed from β^FGLS are less likely to reflect the true false-positive rate.

Estimating and accounting for the noise covariance resulted in a dramatic decrease in the number of voxels whose residuals failed to exhibit serial independence. Prior to noise estimation (i.e., using the OLS residuals), PercentLB computed under the OSORU basis set, averaged 40% for the sessions conducted with a 1.5 T magnet (n = 27) and 76% for the sessions with a 3 T magnet (n = 12). After noise estimation, PercentLB was reduced to 6 and 19%, respectively. Under the sustained-only and sinusoidal basis sets, PercentLB after noise estimation was within a percentage point of the values obtained for the OROSU basis set. While PercentLB indicates that statistical inference will not be fully accurate in all voxels, we proceed to use all the voxels and their associated P values (under the FGLS approach), given the dramatic improvement in accuracy in a large percentage of voxels.

Activation Map Formation

For each stimulus in a given imaging session, we created an “omnibus” map that tested against the null hypothesis that none of the estimated basis function amplitudes (for a given basis set) were significantly different from zero. To generate these activation maps, statistical inference was drawn from the estimated amplitudes β^FGLS using the generalized hypothesis test. Specifically, we test against linear hypotheses of the form Ho : Rβ = 0, where R is the restriction (or contrast) matrix that defines the linear hypotheses to be tested. Since β^FGLS is asymptotically distributed as a multivariate Gaussian if the noise autocorrelation is estimated consistently, an F-statistic can be used for generalized inference [Burock and Dale, 2000; Fomby et al., 1984). Specifically,


where J is the number of rows in R and Ndof is the total number of time points (for a given voxel) summed across all runs minus the total number of estimated parameters. For calculating the activation map for the kth stimulus in a session, R = [a1 a2 ... aNstim], where ai=k = I (the identity matrix) and ai≠k = 0 (a like-sized matrix of zeros), and Nstim is the total number of stimuli in the session.

We compare the relative performance of the OSORU, sustained-only, and sinusoidal basis sets by comparing the “active” voxels at two P value thresholds (P = 0.001, 0.05 for individual voxels; not corrected for multiple comparisons). For such comparisons to be informative, it is important that the statistical inference be equally valid for the GLM based on each of the three basis sets. We know that the nominal P values do not strictly reflect the true false-positive rate for any of the basis sets. For instance, the sustained-only set will result in biased estimation in cases with phasic responses. Also, the Ljung-Box diagnostic indicated that the noise estimation failed to fully account for the noise autocorrelation in some voxels (albeit a small minority). Importantly however, the Ljung-Box diagnostic did have a similar failure rate under all three basis sets (i.e., similar PercentLB), implying that the P values produced by the three sets are valid to roughly equal degrees. Note that our comparison of activation maps quantifies the ability of the three basis sets to detect a response in the practical situation in which the actual response and the noise is unknown, and consequently a ROC-type analysis (receiver-operator-characteristic) of the true statistical power cannot be performed.

Waveshape Index

Using the OSORU basis functions, we devised a “waveshape index” capable of broadly distinguishing between sustained and phasic responses. The basic idea was to compare the total amount of “transient activity” (defined as the sum of the onset and offset amplitudes) to the activity at the midpoint of the response. Secondarily, the waveshape index was designed to distinguish responses with transient activity that was primarily limited to either the onset or offset component from responses with onset and offset components that approached each other in magnitude. The specific formulation of the waveshape index was chosen so as to yield a robust measure that stayed within a finite range. Reasonable behavior for the waveshape index was confirmed by examining how well the index qualitatively sorted the waveforms of the present study (see Results).

Specifically, the waveshape index was defined as:

12(on+offMid+max(on,off))[set membership][0,1]

where On and Off represent the magnitudes of the onset and offset components of the OSORU basis set (as estimated under the GLM), and Mid is defined as the sum of the magnitude of the sustained component plus one-half of the ramp component.5

Using this definition, maximum values for the waveshape index (i.e., near 1) can only result if the two transient components are similar in magnitude and are large relative to the midpoint response. Values near one-half can reflect a response consisting of solely an onset or offset response, or alternatively a combination of onset and offset activity in a response also having some midpoint activity. Values near zero reflect a response dominated by the midpoint response (i.e., by the sustained and/or ramp components of OSORU basis set).

Eq. (1) was utilized in two different ways in the present study. At the level of individual voxels, Eq. (1) was used to create spatial “maps” of the waveshape index for a given stimulus in a session. We also calculated a summary waveshape index for each stimulus of each session using the magnitudes of the basis functions averaged across all the active voxels (P < 0.001) in auditory cortex. Despite the non-linearity of the waveshape index calculation, the resulting summary waveshape indices were quite similar to a second possible summary measure, defined as the mean of the waveshape indices computed separately for each of the active voxels (R2 = 0.97 in a linear regression comparing the two alternatives). Thus, the summary waveshape index was insensitive to the precise method of summarizing across voxels.


Activation Detection: OSORU vs. Sustained-Only and Sinusoidal Basis Functions

OSORU vs. sustained-only

From the standpoint of activation detection, the OSORU basis set was as effective as the sustained-only set, and was far more effective in many cases. The difference in performance is illustrated qualitatively by the maps in Figure 2, which show activation for two stimuli (2/sec and 35/sec noise burst trains) studied in the same imaging session. The OSORU and sustained-only basis sets detected approximately equal extents of activation for 2/sec noise bursts, which produce sustained response waveshapes (Fig. 2, left). In contrast, for 35/sec noise bursts, the OSORU set detected extensive activation while the sustained-only set gave the impression of little activity in auditory cortex (Fig. 2, right). That this impression from the sustained-only set was erroneous could be seen by inspecting the responses of individual voxels. In response to 35/sec noise bursts, voxels throughout auditory cortex showed signal peaks that were clearly time-locked to sound onset and offset. These “phasic” responses to 35/sec noise bursts were missed by the sustained-only set, whereas they were detected by the OSORU set, hence the greater extent of detected activation for the OSORU basis set.

The trends seen qualitatively in Figure 2 were also borne out in quantitative comparisons of maps based on the OSORU and sustained-only basis sets. For most imaging sessions and stimuli (i.e., 117 of 177 cases), more voxels were defined as “active” (P < 0.001) using the OSORU basis functions. To quantify the nature of the overlap of activated voxels obtained with the two basis sets, we computed the number of voxels that were designated as active in the OSORU-based map, but not the sustained-only map (NOSORU-only; i.e., an “exclusive or” operation), or vice-versa (Nsust-only). This number was then divided by the number of voxels designated as active in both maps (Nboth, an “or” operation, in which voxels active in both maps were counted only once). Across the 117 cases that had more “active” voxels in the OSORU-based map, NOSORU-only/Nboth averaged 0.44, while Nsust-only/Nboth averaged only 0.04. Thus, the voxels detected by the sustained-only set were essentially a subset of the voxels detected by the OSORU set. Moreover, they were an appreciably smaller subset.

Further analysis of voxel overlap showed that even for the cases in which the sustained-only basis set detected an equal or greater number of voxels (60 cases, presumably with primarily sustained underlying responses6), the number of such voxels was only marginally greater than those detected by the OSORU set. Specifically, for these cases, Nsust-only/Nboth averaged only 0.16. Thus, any loss of power in detecting sustained responses under the OSORU set (due to the increased number of basis functions) was, in practice, quite small.

OSORU vs. sinusoidal

The OSORU basis set was as effective as, or better than, the sinusoidal set in terms of activation detection. Their performance is illustrated by the activation maps in Figure 2. For both stimuli (2/sec and 35/sec noise burst trains), the extent of activation with the OSORU basis set (Fig. 2, top row) was comparable to that with the sinusoidal set (Fig. 2, middle row). Notably, both basis sets revealed widespread activation in auditory cortex for both stimuli, despite the substantial difference in underlying response waveform: sustained in one case (2/sec), and phasic in the other (35/sec; Fig. 2, bottom row). Thus, both the OSORU and sinusoidal basis sets have the flexibility to detect the extremes of response waveshape seen in auditory cortex.

Quantitative analysis revealed that more voxels were generally detected with the OSORU basis set than with the sinusoidal set. On average, across all imaging sessions and stimuli (i.e., all 177 cases), 31% of voxels within auditory cortex had P values less than 0.001 for the OSORU set, compared to 25% using the sinusoidal set. In 164 of the 177 cases, more voxels were defined as active using the OSORU basis set. The fraction of voxels detected only with the OSORU set (NOSORU-only/Nboth) was, on average, 0.26, whereas the fraction detected only with the sinusoidal set (Nsins-only/Nboth) averaged 0.03 (the nomenclature and quantification parallels the comparison of the OSORU and sustained-only basis sets above). In other words, the OSORU basis set detected more “active” voxels in the vast majority of cases, and rarely missed voxels detected with the sinusoidal basis set.

For the small minority of cases in which the number of voxels active in the OSORU map was less than or equal to the number in the sinusoidal map (13 of 177), NOSORU-only/Nboth and Nsins-only/Nboth were similar: 0.13 and 0.19. This result indicates that in a small fraction of cases the two basis sets were complementary in that each detected a comparable fraction of voxels missed by the other.

Insensitivity of comparisons to P value

As a check that our comparison of the OSORU, sustained-only, and sinusoidal basis sets was not unduly sensitive to the specific P value criterion chosen, we repeated our quantification of activation map overlap using a liberal P value threshold of 0.05. While more voxels were subsequently designated “active”, the relative performance of the three basis sets, as defined by the percentage overlap of the active voxels, was essentially unaltered.

Relative Importance of the Individual OSORU Basis Functions

To gain a sense of the importance of the individual basis functions in the OSORU basis set relative to each other and the overall response fit, we conducted the following analysis. For each of the 177 cases, F-maps were computed for each of the five basis functions in the OSORU set, based on the estimated amplitude of each function and its associated variance. Note that this is equivalent to computing t-maps and proceeding to work with both the positive and negative tails of the distribution, since F(1,ν) = t2(ν). Then, for the voxels that were “active” in auditory cortex according to the full OSORU basis set (i.e., P < 0.001), we counted the number of voxels that had P < 0.1 in the map computed for a given, single basis function. (A higher P value cutoff was used for the individual functions to avoid an overly restrictive portrayal of their contribution to the overall response fit). In the literature of linear regression, this procedure is sometimes called a partial F-test, and is used in “backward elimination” procedures for determining the “best” set of variables for a regression [Draper and Smith, 1981]. Conceptually, one can think of the partial F-test as treating a given basis function as if it were the last to enter the regression equation, and then testing the null hypothesis that a reduced basis set without the given function accounts (statistically) for as much of the signal variance as the full basis set.7 Rejection of the null hypothesis means that the given basis function significantly improves the response fit, relative to a reduced basis set without this function.

All of the components of the OSORU basis set were important to the response estimation, although with varying degrees of frequency. Of the total of 6,731 “active” voxels as determined by the full OSORU set (across all 177 cases), the individual basis functions were significant in their respective F-maps (at P < 0.1) for the following percentages of voxels: 68, 68, 51, 37, and 25%, for the onset, sustained, offset, ramp, and undershoot functions, respectively. If we had implemented a true backward elimination procedure on the 6,731 active voxels, in which we eliminated the single basis function with the highest (i.e., least significant) P value according to the partial F-test (provided this P value was greater than the 0.1 threshold), the onset function would have been eliminated in 10% of the voxels, the sustained in 12%, the offset in 17%, the ramp in 24%, and the undershoot in 35% of the voxels. (In 3% of the active voxels, all the individual basis functions had P < 0.1, so none would have been eliminated). These results indicate that some basis functions were more useful than others. More importantly, however, the analyses indicate that all five basis functions of the OSORU set were important for fitting the responses in a substantial number of voxels.

Assessment of Correspondence Between OSORU Components and Actual Waveforms

Figure 3 compares the magnitude of three response measures estimated using two methods: one based on the amplitudes estimated for the different OSORU basis functions, and an alternative based on measurements from response waveforms. Figure 3 (left) plots the amplitude of the onset basis function versus an alternative waveform-based measure of “onset” amplitude, defined as the maximum response from t = 4-10 sec minus the mean response between t = 12-14 sec. A linear regression line relating the data was highly significant (y = 1.12x + 0.23; P < 0.001, R2 = 0.87). Similarly, an OSORU-based measure of the response near its midpoint (the amplitude of the sustained basis function plus one-half the amplitude of the ramp basis function) was a good description of the waveform amplitude just after the midpoint of the stimulus “on” epoch (defined as the mean response between t = 18-24 sec; y = 0.97x + 0.04; P < 0.001, R2 = 0.92; Fig. 3, middle). Finally, there was good correlation between the amplitude of the offset basis function and a waveform-based measure of “offset” amplitude, defined as the maximum response in the 4-8 sec following stimulus termination (34 ≤ t ≤ 38 sec) minus the response amplitude at stimulus termination (t = 30 sec; y = 0.93x + 0.26; P < 0.001, R2 = 0.74; Fig. 3, right). Overall, there was a good correspondence between the two methods of amplitude quantification, indicating that the amplitudes of the OSORU basis functions can provide as accurate an assessment of waveshape features as direct measurements from the waveforms themselves.

Figure 3
Comparison of OSORU-based vs. waveform-based measures of three different response amplitudes: onset, midpoint, and offset. Each + represents a value based on all the “active” voxels in auditory cortex (defined as voxels with P < ...

Using the OSORU Basis Functions to Probe Response Physiology

The utility of the OSORU basis set for extracting information concerning the physiology of responding brain areas is illustrated by the following example. Here, we examine the relative amounts of transient and sustained activity in auditory cortex using the waveshape index (defined in Methods), a measure derived directly from the OSORU basis functions. To qualitatively evaluate the effectiveness of this measure in capturing the relative amounts of transient vs. sustained activity, we examined actual response waveforms sorted according to their waveshape index. Figure 4 illustrates the sorting, and simultaneously provides a sense of the overall variety of responses in the database. Across the complete data set, the waveshape index did a good job of sorting the responses, as determined by visual inspection. Responses spanned a wide range, extending from highly sustained (indicating activity throughout the stimulus presentation) to highly phasic (indicating transient activity at sound onset and offset). As a rough approximation, responses with waveshape indices less than ~0.25 were primarily sustained, while responses with indices greater than ~0.6 were predominately phasic. Intermediate waveshape indices (between ~0.25 and ~0.6) indicated a blend of sustained and transient activity.

Figure 4
Response waveforms (solid lines) sorted according to their summary waveshape index. Every sixth waveform of the complete, sorted database (177 waveforms) is displayed. The shaded region indicates the 30-sec stimulus “on” period. Each response ...

Figure 5 illustrates how positional variations in the relative amounts of transient and sustained activity can be explored by spatially mapping the waveshape index on a voxel-by-voxel basis within auditory cortex. The two cases in Figure 5 were selected because they show clear changes in waveshape with position. Specifically, there is an increase in waveshape index from Heschl’s gyrus medially to superior temporal gyrus laterally. This trend indicates that primary auditory cortex (located on Heschl’s gyrus) responded to the sound stimuli in a fairly sustained fashion, whereas more lateral, non-primary areas responded more transiently. This example illustrates one of several ways (see Discussion) that spatial variations in the temporal patterns of activation can be explored using the OSORU basis set.

Figure 5
Two cases that showed clear changes in waveshape with position. The cases correspond to two different subjects and stimuli (top: 35/sec clicks; bottom: continuous noise). Each panel shows a color map of waveshape index superimposed on a grayscale anatomical ...


Successful Response Detection With the OSORU Basis Set

The physiologically-motivated, OSORU basis set successfully detected a variety of response waveforms in auditory cortex, as measured against two alternatives: a sustained-only basis “set,” and a sinusoidal basis set that has been previously used for flexible response estimation under the general linear model [Ardekani et al., 1999]. The OSORU basis set in many instances dramatically outperformed the sustained-only “set,” since the OSORU set is able to fit a variety of underlying response waveforms. Frequently, the sustained-only basis set failed to identify voxels that had a consistent and repeatable, but “non-sustained,” response to a given stimulus, thus providing a misleading picture of brain activation. But importantly, the ability of the OSORU basis set to detect a wide variety of response waveforms did not come at the cost of poor detection of sustained responses; even for sustained responses, the OSORU set detected as many or nearly as many voxels as the sustained-only set.

The fact that the OSORU basis set consistently designated more voxels as “active” than the sinusoidal set suggests that the OSORU set had greater statistical power for detecting the responses of the present database. The superiority of the OSORU basis set is particularly impressive given that several factors in the present study acted to boost the likelihood of response detection under the sinusoidal approach. For instance, we purposely extended the sinusoidal set used previously in other studies to include the 4th harmonics, so as to increase the likelihood of detecting phasic responses. In addition, the actual responses were such that the signal power tended to be concentrated at either the 1st, or the 2nd and 4th harmonics. This concentration of signal power at certain frequencies occurred because (1) responses tended to be either sustained, or biphasic (with onset and offset transients), and (2) the “on” and “off” stimulus epochs were of equal length, thus yielding responses with a certain “symmetry” that increases the power at the harmonics. In contrast, for responses consisting of mainly a single transient (e.g., Figure 4, 4th row, 1st column), or in paradigms with unequal duration “on” and “off” epochs, more of the signal power will be dispersed to higher frequencies, and, consequently, the OSORU basis set would enjoy an even greater advantage over the sinusoidal set.8 Overall, in terms of activation detection, the OSORU basis set performed better than the sinusoidal alternative, which itself was optimized to do well in the current context. Altogether, the comparison of the OSORU basis set with the sinusoidal and sustained-only sets indicates that the OSORU set had the flexibility to model a variety of response waveforms, and yet was concise enough to maintain good statistical power.

A Challenging Database Provided a Strong Test of the OSORU Basis Set

The auditory cortical responses used to test the OSORU basis set were an important element of the present study. The varied temporal dynamics of these data posed a significant challenge for detection. Despite this challenge, the OSORU basis set performed well.

Whether other approaches with the potential for identifying responses with a range of dynamics (e.g., wavelet analysis, fuzzy clustering, or principle component analysis) would also perform well on auditory cortical data is an open question. For the most part, these techniques have been tested on less challenging data, typically sustained responses, either simulated or measured [Baumgartner et al., 1998; Brammer, 1998; Chuang et al., 1999; Fadili et al., 2000; Golay et al., 1998; Sychra et al., 1994; von Tscharner and Thulborn, 2001]. Applying these techniques to a broader range of waveforms, such as those found in auditory cortex, would provide a far stronger test of their detection capabilities.

Detecting and Mapping Response Dynamics

One advantage of the OSORU basis set is that it can facilitate the detection of as yet unidentified responses with forms that might be logically hypothesized based on already identified waveforms. For instance, the sharp decline in signal that forms the phasic response onset component may reflect underlying neural adaptation, and the off component likely represents neural responses to the offset of sound [Harms and Melcher, 2002]. It is possible that neuronal populations exhibiting these responses are not entirely co-localized, or that onset and offset responses do not occur together for every sound (e.g., Fig. 4, 4th row, 1st column). A GLM implemented with the OSORU basis set provides a straightforward way to test these ideas, both qualitatively and statistically. Since the GLM provides estimates of the response components complete with variances, statistical inference can be performed, such as examining whether a given component is significantly different from zero on a voxel-by-voxel basis. This statistical rigor is one advantage of using the OSORU basis set for both detection and response quantification, rather than first detecting responses (e.g., using a “non-physiological model” such as the sinusoidal basis set) and then extracting different measures directly from response waveforms. There is also a certain parsimony, as well as computational efficiency, in obtaining response measures from the same model used to identify responding brain areas.

Using the OSORU basis set, it is a straightforward matter to construct a variety of maps that may reveal spatial variations in response physiology. Mapping the relative amounts of transient and sustained activity using a parameter such as the “waveshape index” (as in Fig. 5) is one such example. Activation maps could also be constructed based on individual OSORU basis functions or linear combinations thereof, e.g., only the sustained component, or the sum (or difference) of the onset and offset components. The correlation between basis functions does not invalidate such an approach, but simply requires that maps constructed from isolated components be interpreted appropriately within the context of the full basis set in which the parameters were originally estimated [Andrade et al., 1999]. In the case of non-linear combinations of components, such as the waveshape index, statistical inference is not automatically supported. However, using the estimated covariance of the basis functions and Monte-Carlo simulation techniques, it would be possible to estimate, for instance, whether two waveshape indices are statistically different from one another. An alternative to mapping different response components would be to visually examine the responses for individual voxels and then attempt a mental synthesis across voxels to extract any spatial trends. However, in cases with a large number of “active” voxels, this approach becomes unwieldy and the shear volume of data to be synthesized makes it difficult to identify any trends with position. In contrast, the approach of extracting and mapping particular OSORU components (or combinations of components) provides a consolidated view of brain responses that can readily reveal positional variations in underlying physiology (e.g., Fig. 5).

In searching for responses with unknown temporal dynamics, there is no single, optimal basis set, and using two or more complementary detection approaches will likely be beneficial. GLMs implemented with the OSORU and sinusoidal basis sets form one such complementary pair. Since the OSORU basis set performed better than the sinusoidal set on the database of the present study, it may also perform better in cases where similar responses are produced. However, the superiority of the OSORU set derives from partially tailoring the basis functions to known, general response features. Therefore, the relative effectiveness of the OSORU and sinusoidal sets might well reverse if the underlying responses differ sufficiently from those represented in the current database. The OSORU basis set, unlike the sinusoidal set, has the advantage that it is easily extended to paradigms that are not strictly periodic (e.g., paradigms in which the durations of the epochs may vary somewhat across stimulus presentations, or may vary between the different stimuli). However, unlike the OSORU basis set, the sinusoidal set offers flexibility for detecting responses with variable temporal delays. Similar “temporal flexibility” could be incorporated into an OSORU-like physiological basis set by allowing variations in the latency of the different functions. However, this would require non-linear, iterative estimation techniques, which are computationally intensive, and one would no longer be able to take advantage of the well-developed theoretical framework for linear models that allows estimation of appropriate statistics given correlated noise. Thus, the OSORU and sinusoidal approaches are complimentary in several respects. So, using both, rather than either alone, will increase the likelihood that all viable response waveforms are detected.

Previous Implementations of the General Linear Model Within a Physiological Framework

Within the context of the general linear model, there appears to be little previous discussion regarding the selection of a set of basis functions that might be advantageous from a physiological perspective. Sobel et al. [2000] confirmed the presence of response habituation over the course of a prolonged odorous stimulus in olfactory cortex using a reference waveform that modeled habituation. However, they did not attempt to define a broader set of more general basis functions that could simultaneously model multiple forms of response. Giraud et al. [2000] observed transient responses in auditory cortex at the transition between amplitude-modulated and continuous noise. These authors handled the situation with an analysis that included three entirely distinct response models (a sustained response model, a transient response model, and mixed model), rather than synthesizing all the possible responses into a single basis set. In short, there have been only a few previous attempts to implement the GLM within any kind of physiological framework, and none, prior to the present study, has attempted a comprehensive basis set capable of accommodating an extensive range of response forms.

Physiologically-Based Implementations of the GLM: Broad Applicability to Any Brain System

While the present study focused specifically on the auditory system, it is important to recognize that the GLM implemented with the OSORU basis set may also be applicable to other brain systems. There are certainly documented cases of non-sustained responses outside the auditory system [Bandettini et al., 1997; Hoge et al., 1999; Nakai et al., 2000; Sobel et al., 2000]. There are also hints that responses ranging from sustained to phasic may occur for other sensory systems. For instance, measurements of time-average signal as a function of increasing repetition rate show a downturn at high rates in the auditory [Harms and Melcher, 2002], somatosensory [Ibanez et al., 1995; Takanashi et al., 2001], and visual systems [Fox and Raichle, 1984, 1985; Kwong et al., 1992; Mentis et al., 1997; Thomas and Menon, 1998; Zhu et al., 1998]. In the auditory system, this decrease occurs because the response changes from sustained to phasic. A similar change in response waveform may explain the decrease in time-average signal in other sensory systems (rather than just a sustained response that decreases in amplitude with increasing rate). Whether this fundamental change in response “mode” indeed occurs in other sensory systems could be tested by detecting and quantifying responses using the OSORU basis set.

The OSORU basis set may not be directly applicable to every task and brain region, but the principle they illustrate, using a physiologically motivated basis set, can nevertheless be applied. For instance, one can imagine completely new physiological basis sets that are tailored to the particular temporal dynamics of a given brain system. Alternatively, the OSORU set might be “tuned” to the particular properties of different brain systems (e.g., by adjusting the timing of the onset and offset basis functions), or additional basis functions might be added to reflect physiological processes not conceptualized in the current framework. Overall, physiologically-motivated basis sets should prove useful in detecting and quantifying responses throughout the brain.


We gratefully thank John Guinan, Anders Dale, Patrick Purdon, Irina Sigalovsky, Monica Hawley, and Rick Hoge for numerous helpful comments and suggestions; Doug Greve for providing the Matlab code that formed the basis of our GLM implementation; and Barbara Norris for considerable assistance in figure preparation. M.P.H. was partially funded by an Athinoula A. Martinos Scholarship.


In actually computingβ^FGLS and Λ^ we utilized the fact that the data were acquired in separate imaging runs, in order to avoid working with unnecessarily large data matrices. Specifically, we computed


where Xr is the design matrix for run r, yr is the fMRI time-series from run r (pre-processed as specified in Methods), and Nr is the number of runs. Using the OLS residual errors for each run er=yXrβ^OLS, we estimated the noise autocorrelation (up to lag k = 15; i.e., a maximum lag of 30 s) as ρk=δk/δ0,where

δk=1Ntotal[center dot]runsAuditory Cortex voxelst=1Trker,ter,t+k.

Ntotal gives the total number of terms in the above triple summation, and er,t is the tth OLS residual from run r (with Tr total time points). Given ρk, we constructed Λ^=Toeplitz(ρk), where the Toeplitz operator returns a matrix that is symmetrical about the leading diagonal. Finally, we estimated


The scalar variance at each voxel was estimated from the whitened residual error er*=Λ^1/2(yrXrβ^FGLS) as

σ^2=1Ndof[center dot]r=1Nrer*Ter*,

where Ndof is the total number of time points (for a given voxel) summed across all runs minus the total number of estimated parameters (i.e., the number of columns in the design matrix X).


1This report examines all of the experiments from these studies that imaged a single slice and used a 30 sec “on,” 30 sec “off” paradigm.

2Technically, the amplitude of the boxcar waveform for the first sample (t = 0 sec) was twice that of the remainder of the 30-sec “on” period, in order to impart a more rapid signal increase to the sustained component.

3For the present study, the GLM was implemented by programs originally developed by Doug Greve and colleagues (at the Athinoula A. Martinos Center for Biomedical Imaging), but that we modified for our purposes. It should be possible to incorporate the OSORU and sinusoidal basis sets within any package that implements the GLM. (The “sustained-only” set or an equivalent is a standard part of GLM packages). Note that some packages (e.g., Statistical Parametric Mapping) automatically orthonormalize their basis sets, which would have to be disabled in order to preserve the physiological meaning of the OSORU functions.

4An alternative would be to “pre-color” the data by replacing the unknown endogenous autocorrelation with an exogenous autocorrelation of known form by applying a smoothing filter. This approach is theoretically less efficient than pre-whitening, but also less prone to invalid statistical inference if the noise is incorrectly estimated [Bullmore et al., 2001; Friston et al., 2000].

5Technically, On, Off, and Mid were all rectified prior to their use in Eq. (1) (i.e., negative values were converted to zero). Consequently, if On, Off, and Mid, were all negative (or zero) prior to rectification, the denominator of Eq. (1) would be zero, and hence the waveshape index would be undefined. In practice, this situation never occurred when the magnitudes of the basis functions were first averaged across active voxels in auditory cortex (i.e., the “summary waveshape index”). Nor did this situation occur for any of the individual voxels active in left auditory cortex for the spatial maps of waveshape index shown in Figure 5.

6The average waveshape index of the 60 cases for which the OSORU basis set detected slightly fewer voxels than the sustained-only set was 0.20, consistent with sustained responses (Fig. 4).

7This “analysis of variance” interpretation of the partial F-test is strictly true only in a framework with independent, uncorrelated errors (i.e., ordinary least squares).

8Note that these comments apply equally well to Fourier analyses that are applied in the frequency domain. Even though a stimulus paradigm may be completely periodic, this does not imply that a Fourier F-test [e.g., Marchini and Ripley, 2000; Purdon and Weisskoff, 1998] will detect all periodic responses with equal statistical power. The important criterion is that the frequencies represented in the response be concentrated at the frequencies used for the F-test. Indeed, a Fourier F-test at the (single) frequency representing the paradigm periodicity is mathematically equivalent to the application of the general linear model using just a single sine and cosine term, an approach that has relatively poor power for detecting phasic or transient (yet periodic) responses.

Contract grant sponsor: NIH/NIDCD; Contract grant numbers: PO1DC00119, T32DC00038; Contract grant sponsor: National Center for Research Resources; Contract grant number: P41RR14075; Contract grant sponsor: Mental Illness and Neuroscience Discovery (MIND) Institute.


  • Andersen AH, Gash DM, Avison MJ. Principal component analysis of the dynamic response measured by fMRI: a generalized linear systems framework. Magn Reson Imag. 1999;17:795–815.
  • Andrade A, Paradis AL, Rouquette S, Poline JB. Ambiguous results in functional neuroimaging data analysis due to covariate correlation. Neuroimage. 1999;10:483–486. [PubMed]
  • Ardekani BA, Kanno I. Statistical methods for detecting activated regions in functional MRI of the brain. Magn Reson Imag. 1998;16:1217–25.
  • Ardekani BA, Kershaw J, Kashikura K, Kanno I. Activation detection in functional MRI using subspace modeling and maximum likelihood estimation. IEEE Trans Med Imag. 1999;18:101–114.
  • Bandettini PA, Kwong KK, Davis TL, Tootell RBH, Wong EC, Fox PT, Belliveau JW, Weisskoff RM, Rosen BR. Characterization of cerebral blood oxygenation and flow changes during prolonged brain activation. Hum Brain Mapp. 1997;5:93–109. [PubMed]
  • Baumgartner R, Windischberger C, Moser E. Quantification in functional magnetic resonance imaging: fuzzy clustering vs. correlation analysis. Magn Reson Imag. 1998;16:115–125.
  • Brammer MJ. Multidimensional wavelet analysis of functional magnetic resonance images. Hum Brain Mapp. 1998;6:378–382. [PubMed]
  • Bullmore E, Brammer M, Williams SCR, Rabe-Hesketh S, Janot N, David A, Mellers J, Howard R, Sham P. Statistical methods of estimation and inference for functional MR image analysis. Magn Reson Med. 1996;35:261–277. [PubMed]
  • Bullmore E, Long C, Suckling J, Fadili J, Calvert G, Zelaya F, Carpenter TA, Brammer M. Colored noise and computional inference in neurophysiological (fMRI) time series analysis: resampling methods in time and wavelet domains. Hum Brain Mapp. 2001;12:61–78. [PubMed]
  • Burock MA, Dale AM. Estimation and detection of event-related fMRI signals with temporally correlated noise: a statistically efficient and unbiased approach. Hum Brain Mapp. 2000;11:249–260. [PubMed]
  • Chuang KH, Chiu MJ, Lin CC, Chen JH. Model-free functional MRI analysis using Kohonen clustering neural network and fuzzy C-means. IEEE Trans Med Imag. 1999;18:1117–1128.
  • Clare S, Humberstone M, Hykin J, Blumhardt LD, Bowtell R, Morris P. Detecting activations in event-related fMRI using analysis of variance. Magn Reson Med. 1999;42:1117–1122. [PubMed]
  • Constable RT, Skudlarski P, Gore JC. An ROC approach for evaluating functional brain MR imaging and postprocessing protocols. Magn Reson Med. 1995;34:57–64. [PubMed]
  • Draper NR, Smith H. Applied regression analysis. John Wiley & Sons; New York: 1981.
  • Fadili MJ, Ruan S, Bloyet D, Mazoyer B. A multistep unsupervised fuzzy clustering analysis of fMRI time series. Hum Brain Mapp. 2000;10:160–178. [PubMed]
  • Fomby TB, Hill RC, Johnson SR. Advanced econometric methods. Springer-Verlag; New York: 1984.
  • Fox PT, Raichle ME. Stimulus rate dependence of regional cerebral blood flow in human striate cortex, demonstrated by positron emission tomography. J Neurophysiol. 1984;51:1109–1120. [PubMed]
  • Fox PT, Raichle ME. Stimulus rate determines regional brain blood flow in striate cortex. Ann Neurol. 1985;17:303–305. [PubMed]
  • Friston KJ, Ashburner J, Frith CD, Poline J-B, Heather JD, Frackowiak RSJ. Spatial registration and normalization of images. Hum Brain Mapp. 1995a;2:165–189.
  • Friston KJ, Frith CD, Frackowiak RSJ, Turner R. Characterizing dynamic brain responses with fMRI: a multivariate approach. Neuroimage. 1995b;2:166–172. [PubMed]
  • Friston KJ, Holmes AP, Worsley KJ, Poline J-P, Frith CD, Frackowiak RSJ. Statistical parametric maps in functional imaging: A general linear approach. Hum Brain Mapp. 1995c;2:189–210.
  • Friston KJ, Williams S, Howard R, Frackowiak RSJ, Turner R. Movement-related effects in fMRI time-series. Magn Reson Med. 1996;35:346–355. [PubMed]
  • Friston KJ, Zarahn E, Holmes AP, Rouquette S, Poline J-B. To smooth or not to smooth? Bias and efficiency in fMRI time-series analysis. Neuroimage. 2000;12:196–208. [PubMed]
  • Giraud AL, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak R, Kleinschmidt A. Representation of the temporal envelope of sounds in the human brain. J Neurophysiol. 2000;84:1588–1598. [PubMed]
  • Golay X, Kollias S, Stoll G, Meier D, Valavanis A, Boesiger P. A new correlation-based fuzzy logic clustering algorithm for fMRI. Magn Reson Med. 1998;40:249–260. [PubMed]
  • Guimaraes AR, Melcher JR, Talavage TM, Baker JR, Ledden P, Rosen BR, Kiang NY, Fullerton BC, Weisskoff RM. Imaging subcortical auditory activity in humans. Hum Brain Mapp. 1998;6:33–41. [PMC free article] [PubMed]
  • Harms MP. Sound temporal envelope and time-patterns of activity in the human auditory pathway: an fMRI study. Ph.D. Thesis. Massachusetts Institute of Technology; Cambridge: 2002.
  • Harms MP, Melcher JR. Sound repetition rate in the human auditory pathway: representations in the waveshape and amplitude of fMRI activation. J Neurophysiol. 2002;88:1433–1450. [PubMed]
  • Harms MP, Sigalovsky IS, Guinan JJ, Jr., Melcher JR. Temporal dynamics of fMRI responses in human auditory cortex: dependence on stimulus type. Assoc Res Otolaryngol Abs. 2001;24:185.
  • Hoge RD, Atkinson J, Gill B, Crelier GR, Marrett S, Pike GB. Stimulus-dependent BOLD and perfusion dynamics in human V1. Neuroimage. 1999;9:573–585. [PubMed]
  • Ibanez V, Deiber MP, Sadato N, Toro C, Grissom J, Woods RP, Mazziotta JC, Hallett M. Effects of stimulus rate on regional cerebral blood flow after median nerve stimulation. Brain. 1995;118:1339–1351. [PubMed]
  • Kwong KK, Belliveau JW, Chesler DA, Goldberg IE, Weisskoff RM, Poncelet BP, Kennedy DN, Hoppel BE, Cohen MS, Turner R, et al. Dynamic magnetic resonance imaging of human brain activity during primary sensory stimulation. Proc Natl Acad Sci. 1992;89:5675–5679. [PubMed]
  • Ljung GM, Box GEP. On a measure of lack of fit in time series models. Biometrika. 1978;65:297–303.
  • Locascio JJ, Jennings PJ, Moore CI, Corkin S. Time series analysis in the time domain and resampling methods for studies of functional magnetic resonance brain imaging. Hum Brain Mapp. 1997;5:168–193.
  • Marchini JL, Ripley BD. A new statistical approach to detecting significant activation in functional MRI. Neuroimage. 2000;12:366–380. [PubMed]
  • Marchini JL, Smith SM. On bias in the estimation of autocorrelations for fMRI voxel time-series analysis. Neuroimage. 2003;18:83–90. [PubMed]
  • Mentis MJ, Alexander GE, Grady CL, Horwitz B, Krasuski J, Pietrini P, Strassburger T, Hampel H, Schapiro MB, Rapoport SI. Frequency variation of a pattern-flash visual stimulus during PET differentially activates brain from striate through frontal cortex. Neuroimage. 1997;5:116–128. [PubMed]
  • Miezin FM, Maccotta L, Ollinger JM, Petersen SE, Buckner RL. Characterizing the hemodynamic response: effects of presentation rate, sampling procedure, and the possibility of ordering brain activity based on relative timing. Neuroimage. 2000;11:735–759. [PubMed]
  • Nakai T, Matsuo K, Kato C, Takehara Y, Isoda H, Moriya T, Okada T, Sakahara H. Post-stimulus response in hemodynamics observed by functional magnetic resonance imaging: difference between the primary and sensorimotor area and the supplementary motor area. Magn Reson Imag. 2000;18:1215–1219.
  • Paradis AL, Mangin JF, Bloch I, Cornilleau-Peres V, Moulines E, Frouin V, Le Bihan D. Detection of periodic signals in brain echo-planar functional images; 18th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; Amsterdam. 1996.1996. pp. 696–697.
  • Purdon PL, Solo V, Weisskoff RM, Brown EN. Locally regularized spatiotemporal modeling and model comparison for functional MRI. Neuroimage. 2001;14:912–923. [PubMed]
  • Purdon PL, Weisskoff RM. Effect of temporal autocorrelation due to physiological noise and stimulus paradigm on voxel-level false-positive rates in fMRI. Hum Brain Mapp. 1998;6:239–249. [PubMed]
  • Ravicz ME, Melcher JR. Isolating the auditory system from acoustic noise during functional magnetic resonance imaging: examination of noise conduction through the ear canal, head, and body. J Acoust Soc Am. 2001;109:216–231. [PMC free article] [PubMed]
  • Seifritz E, Esposito F, Hennel F, Mustovic H, Neuhoff JG, Bilecen D, Tedeschi G, Scheffler K, Di Salle F. Spatiotemporal pattern of neural processing in the human auditory cortex. Science. 2002;297:1706–1708. [PubMed]
  • Skudlarski P, Constable RT, Gore JC. ROC analysis of statistical methods used in functional MRI: Individual subjects. Neuroimage. 1999;9:311–329. [PubMed]
  • Sobel N, Prabhakaran V, Zhao Z, Desmond JE, Glover GH, Sullivan EV, Gabrieli JDE. Time course of odorant-induced activation in the human primary olfactory cortex. J Neurophysiol. 2000;83:537–551. [PubMed]
  • Sychra JJ, Bandettini PA, Bhattacharya N, Lin Q. Synthetic images by subspace transforms. I. Principal components images and related filters. Med Phys. 1994;21:193–201. [PubMed]
  • Takanashi M, Abe K, Yanagihara T, Oshiro Y, Watanabe Y, Tanaka H, Hirabuki N, Nakamura H, Fujita N. Effects of stimulus presentation rate on the activity of primary somatosensory cortex: a functional magnetic resonance imaging study in humans. Brain Res Bull. 2001;54:125–129. [PubMed]
  • Thomas CG, Menon RS. Amplitude response and stimulus presentation frequency response of human primary visual cortex using BOLD EPI at 4 T. Magn Reson Med. 1998;40:203–209. [PubMed]
  • von Tscharner V, Thulborn KR. Specified-resolution wavelet analysis of activation patterns from BOLD contrast fMRI. IEEE Trans Med Imag. 2001;20:704–714.
  • Wicker B, Fonlupt P. Generalized least-squares method applied to fMRI time series with empirically determined correlation matrix. Neuroimage. 2003;18:588–594. [PubMed]
  • Worsley KJ, Liao CH, Aston J, Petre V, Duncan GH, Morales F, Evans AC. A general statistical analysis for fMRI data. Neuroimage. 2002;15:1–15. [PubMed]
  • Zarahn E, Aguirre GK, D’Esposito M. Empirical analyses of BOLD fMRI statistics. I. Spatially unsmoothed data collected under null-hypothesis conditions. Neuroimage. 1997;5:179–197. [PubMed]
  • Zhu XH, Kim SG, Andersen P, Ogawa S, Ugurbil K, Chen W. Simultaneous oxygenation and perfusion imaging study of functional activity in primary visual cortex at different visual stimulation frequency: Quantitative correlation between BOLD and CBF changes. Magn Reson Med. 1998;40:703–711. [PubMed]