When people make decisions quickly, accuracy suffers. Traditionally, speed-accuracy tradeoffs (SAT) have been almost exclusively ascribed to changes in the amount of sensory evidence required to support a response (response caution) and the neural correlates associated with the later stages of decision making (e.g., motor response generation and execution). Here, we investigated whether performance decrements under speed pressure also reflect suboptimal information processing in early sensory areas such as primary visual cortex (V1). Human subjects performed an orientation discrimination task while emphasizing either response speed or accuracy. A model of choice behavior revealed that the rate of sensory evidence accumulation was selectively modulated when subjects emphasized accuracy, but not speed, suggesting that changes in sensory processing also influence the SAT. We then used functional MRI (fMRI) and a forward encoding model to derive orientation-selective tuning functions based on activation patterns in V1. When accuracy was emphasized, the extent to which orientation-selective tuning profiles exhibited a theoretically optimal gain pattern predicted both response accuracy and the rate of sensory evidence accumulation. However, these relationships were not observed when subjects emphasized speed. Collectively, our findings suggest that, in addition to lowered response thresholds, the performance decrements observed during speeded decision making may result from a failure to optimally process sensory signals.
Fast decisions are typically more error prone, while precise decisions require more time, a phenomenon known as the speed-accuracy tradeoff (or SAT; Woodworth, 1899; Fitts, 1966; Wickelgren, 1977; Dickman and Meyer, 1988). Traditional models of the SAT hold that fast but premature responses occur when not enough sensory information has been accumulated to support an accurate judgment (i.e., response thresholds are too low; Bogacz et al., 2006; Ratcliff and McKoon, 2008). On this response threshold account, the SAT is mediated by neural mechanisms of late-stage decision processes that immediately precede the initiation of motor responses (Van Veen et al., 2008; Forstmann et al., 2008; Forstmann et al., 2010; Bogacz et al., 2010). A complementary – and largely untested – hypothesis is that speed pressure also influences the efficiency with which sensory evidence is accumulated during decision making (sensory-readout hypothesis). This is an important possibility given that the rate of sensory evidence accumulation necessarily limits the efficacy of downstream decision-making and motor control processes.
To investigate the influence of the SAT on sensory processing, we designed a perceptual decision making task that required human observers to discriminate between two oriented grating stimuli (see Figure 1 and Materials and Methods) under either speed emphasis (SE) or accuracy emphasis (AE) conditions. Importantly, subjects had to discriminate a small rotational offset (5°) between the gratings. Previous psychophysical and neurophysiological studies have shown that the most informative neurons for supporting such fine discriminations are tuned away from the target feature (hereafter termed off-target neurons; see Figure 2; Hol and Treue, 2001; Butts and Goldman, 2006; Jazayeri and Movshon, 2006; Navalpakkam and Itti, 2007; Scolari and Serences, 2009; Moore, 2008; Purushothaman and Bradley, 2005; Regan and Beverley, 1985; Schoups et al., 2001). This theoretical framework thereby provides a benchmark pattern of optimal sensory gain against which we can compare the gain observed under different SAT conditions.
We investigated how the SAT influenced information processing by fitting response time (RT) and accuracy data using two models of choice behavior: the Linear Ballistic Accumulator model (LBA; Brown and Heathcote, 2008; see Figure 3) and an extension of the LBA, the Single Trial Linear Ballistic Accumulator (STLBA; Van Maanen et al., 2011). These models revealed an impact of task instruction on the amount of information required to initiate a decision (response caution) and on the rate of sensory evidence accumulation (the drift rate); the latter effect suggests that the SAT may affect sensory processing (see also Vandekerckhove et al., 2011 and Hübner et al., 2010). We then used a forward encoding model (Brouwer and Heeger, 2009; 2011; reviewed in Naselaris et al., 2011; Serences and Saproo, 2011) to examine how feature-selective BOLD response profiles in primary visual cortex (V1) are associated with behavioral performance and with the rate of sensory evidence accumulation under different SAT conditions. Our results suggest that theoretically optimal response patterns in V1 are associated with more efficient sensory evidence accumulation – but only when accuracy is emphasized over speed.
Sixteen subjects (11 females) were recruited from the University of California, San Diego (UCSD, La Jolla, CA) community. All had normal or corrected-to-normal vision. Each subject gave written informed consent per Institutional Review Board requirements at UCSD and completed a single 1-hour session in a climate- and noise-controlled subject room outside of the scanner and a single 1.5–2 hour session in the scanner. Compensation for participation was $10/hour for behavioral training and $20/hour for scanning. Subjects received an additional reward for task compliance according to a point system described below (mean additional compensation: $6.64). Data from two subjects were discarded due to improper slice stack selection during fMRI scanning that resulted in no data being collected from large portions of primary visual cortex, the main area of interest in this study.
Visual stimuli were generated using the Psychophysics Toolbox (Brainard, 1997; Pelli, 1997) implemented in Matlab (version 7.1; The Math Works, Natick, MA), presented at a frame rate of 60 Hz, and projected onto a screen at the entrance of the scanner bore that subjects viewed through a mirror. Button-press responses were made on an fMRI-compatible response box using the index and middle fingers of the right hand.
Subjects were shown a centrally presented oriented grating (with a diameter of approximately 14°) at full contrast which flickered at 6 Hz (83.33 milliseconds on, 83.33 milliseconds blank interval). On each trial, the orientation of the grating was pseudorandomly selected with equal probability from one of nine possible orientations (0°, 20°, 40°, 60°, 80°, 100°, 120°, 140°, 160°) with a small amount of pseudo-random jitter added (between ±0°–6°, selected from a uniform distribution). On half the trials, the same stimulus was presented at every ‘flicker’ (match trials), but for the remaining trials (mismatch trials), the orientation of the grating was offset by 5° on every other flicker, with the rotational offset of the deviant grating (i.e. clockwise or counterclockwise) fixed on a given trial and counterbalanced across trials (see Figure 1). The subject’s objective was to make a match/mismatch judgment by pressing one of two buttons held in the right hand. The order of match and mismatch trials was pseudo-randomized and counterbalanced within each run. Subjects were allowed to make a response any time after the stimulus onset; the stimulus was present for 3 seconds, after which it was replaced with a white centrally presented fixation circle for 3.5 seconds. We omitted all trials in which a subject failed to give a response (less than 1%) or made a response faster than 200 milliseconds (less than 1.05%). Since the second grating did not appear until 166.67 milliseconds into the trial, we reasoned that responses quicker than 200 milliseconds should be regarded as blind guesses. In total, a single run consisted of 50 trials (36 experimental trials and 14 null trials consisting of just a fixation circle) and lasted 336 seconds including an 11 second period of passive fixation at the end of each run. Across the 36 experimental trials, each of the 9 possible orientations was presented 4 times.
Prior to each run, subjects were instructed by the experimenter to emphasize either response accuracy or speed. Subjects earned points based on their performance: +10 for correct responses on accuracy trials, −10 for incorrect responses on accuracy trials, +10 for correct responses on speed trials within the response deadline, +0 for incorrect responses on speed trials, −10 for any responses exceeding the response deadline. At the end of the experiment, subjects were paid an additional $1 for every 100 points earned during their performance while being scanned (rounded to the nearest dollar). During training in the lab, subjects were given trial-by-trial feedback, but feedback was delayed until the end of each run during the scanning session.
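The point scheme can be summarized as a small helper function. This is an illustrative sketch only: the function name and argument layout are ours, not the authors'.

```python
def trial_points(condition, correct, rt, deadline):
    """Points earned on one trial under the scheme described above.

    condition: 'AE' (accuracy emphasis) or 'SE' (speed emphasis);
    rt and deadline are in seconds. Helper name and signature are
    illustrative, not from the original study.
    """
    if condition == 'AE':
        return 10 if correct else -10
    if rt > deadline:
        return -10                 # any response past the deadline loses points
    return 10 if correct else 0    # no penalty for fast-but-wrong SE responses
```

Note that on speed trials the deadline check takes priority: even a correct response loses 10 points if it arrives too late.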
All participants were trained prior to the scan session for a minimum of 180 trials. During training, subjects received point feedback on a trial-by-trial basis according to the reward scheme outlined above. Participants practiced the task without any speed pressure until they felt comfortable and performed at approximately 90% accuracy. Subjects were then asked to repeat the task by responding as quickly as they could without guessing. The median of their RT distribution on this block was then set as their response deadline for both the subsequent speeded training blocks and the speeded blocks during the fMRI session.
Behavioral data were modeled using the LBA, which is a simplified version of the ballistic accumulator model and the leaky competing accumulator model (see Brown and Heathcote, 2005; 2008; Usher and McClelland, 2001). Figure 3 illustrates the LBA model schematically. On each trial, two racing accumulators begin with a random activation level (the starting point) that is independently drawn from a uniform distribution on [0, A]. Activity in each accumulator increases linearly, and a response is triggered as soon as one accumulator crosses the response threshold. The predicted response time is the time taken to reach the threshold, plus a constant offset time (non-decision time). The rate at which activation increases in each accumulator is termed the drift rate for that accumulator. These drift rates are drawn from independent normal distributions for each accumulator (with the standard deviation of these distributions being arbitrarily fixed at 1). The means of the normal distributions reflect the quality of the perceptual input. For example, a salient mismatch between the oriented gratings would lead to a large mean drift rate in the accumulator corresponding to a mismatch response (and vice versa). Hence, the LBA model estimates the mean of this drift rate distribution for each accumulator (“match” or “mismatch”).
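As a concrete illustration, the race just described can be simulated in a few lines of Python. The parameter values below are arbitrary placeholders, not the fitted values reported later in the paper.

```python
import numpy as np

def simulate_lba_trial(rng, b=1.2, A=0.5, t0=0.2,
                       mean_drifts=(0.9, 0.6), s=1.0):
    """Simulate one trial of a two-accumulator LBA race.

    b: response threshold, A: maximum start point, t0: non-decision time,
    mean_drifts: mean drift rate for each accumulator, s: drift-rate SD
    (fixed at 1, as in the paper). Values here are illustrative only.
    """
    starts = rng.uniform(0.0, A, size=2)          # random start points on [0, A]
    drifts = rng.normal(mean_drifts, s)           # trial-specific drift rates
    # Accumulators with non-positive drift never reach the threshold
    with np.errstate(divide='ignore'):
        times = np.where(drifts > 0, (b - starts) / drifts, np.inf)
    winner = int(np.argmin(times))                # first accumulator to cross
    rt = times[winner] + t0                       # add non-decision time
    return winner, rt

rng = np.random.default_rng(0)
trials = [simulate_lba_trial(rng) for _ in range(10000)]
```

With these placeholder values the accumulator with the higher mean drift wins more often, and on the rare trials where both sampled drifts are non-positive no response is produced (an infinite RT in this sketch).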
The distance from the starting point to the response threshold is a measure of response caution, as this distance quantifies the average amount of evidence that needs to be accumulated before a response is initiated. Changes in response caution are usually assumed to originate from adjusting the response threshold; however, because adjusting the response threshold is mathematically equivalent to adjusting the starting point, we chose to fix the height of the uniform distribution (A) from which the starting points were drawn (although the starting points nevertheless varied from trial to trial; see also Ratcliff, 1978; Ratcliff and Rouder, 1998; Forstmann et al., 2010; Van Maanen et al., 2011; Wolfe and Van Wert, 2010). As a result, we hereafter use the response threshold to represent “response caution” since the maximum of the start point distribution is fixed across the SE and AE conditions.
The parameters of the LBA model were estimated using the method of maximum likelihood. Likelihood was optimized using simplex searches (Nelder and Mead, 1965). Initial parameter values for searches were generated two ways: using heuristic calculations based on the data, and using start points determined from the end points of searches for simpler, nested models (full details of these methods and extensive discussion of alternative approaches are provided by Donkin et al., 2011a). We fit the “match” and “mismatch” trials simultaneously, fixing all parameters between these two trial types to be constant except for the drift rate (which is presumably determined by the stimulus). Different drift rates were estimated for the accumulator corresponding to a “mismatch” response on trials with a “mismatch” stimulus (i.e., “correct” drift rate) and on trials with a “match” stimulus (i.e., “incorrect” drift rate; see Table 2). Similarly, different drift rates were estimated for the accumulator corresponding to a “match” response on trials with a “match” stimulus (“correct” drift rate) and on trials with a “mismatch” stimulus (“incorrect” drift rate; see Table 2). Each different design for constraining model parameters across task conditions was fit separately to each individual subject's data. One subject, however, only made one incorrect response among the AE mismatch trials, thereby providing little constraint on the model estimate for that condition. The parameter estimates for that subject were therefore set to the group average for that condition. The overall grouped BIC value provided very strong support for the design that allowed three parameters to vary between SE and AE conditions (response threshold (b), drift rate (v), and non-decision time (t0)). 
To quantify that support, we approximated posterior model probabilities based on BIC assuming a fixed effect for subjects (see Burnham and Anderson, 2002), which showed this design to be more than 10^10 times more likely than the next best design (see Results section).
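The approximation is straightforward to compute: each model's BIC is converted to a weight exp(−ΔBIC/2) and the weights are normalized. The BIC values below are invented solely to illustrate the arithmetic.

```python
import numpy as np

def bic_model_probabilities(bics):
    """Approximate posterior model probabilities from summed BIC values
    (Burnham and Anderson, 2002): p_i is proportional to exp(-dBIC_i / 2)."""
    bics = np.asarray(bics, dtype=float)
    delta = bics - bics.min()            # dBIC relative to the best model
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical BIC values: a dBIC of ~47 between the two best models
# corresponds to a probability ratio of more than 10^10.
probs = bic_model_probabilities([1000.0, 1047.0, 1100.0])
```

This makes concrete how a seemingly modest BIC difference translates into overwhelming relative evidence for one parameterization.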
Response times and accuracy vary on each trial due to environmental changes and/or internal noise in a subject’s cognitive state. It is therefore important to not only map overall mean behavioral patterns with parameters that quantify relevant cognitive processes (as can be done with the standard LBA), but also to link estimates of these parameters and BOLD responses on a trial-by-trial basis.
In the standard LBA model (as in other decision making models, see Ratcliff, 1985; Ratcliff and Rouder, 1998), drift rates are normally distributed across trials, with different distributions for each respective accumulator. This assumption of normally distributed drift rates implies that drift rates which are close to the mean of the distribution are more likely than values from the tails of the distribution. In addition, the uniform distribution [0, A] restricts the range of starting points for each accumulator. These considerations yield the following maximum-likelihood estimates (MLE) for a single-trial drift rate (di) and a single-trial starting point (ai) given a trial with response time (ti):

di = (b − ai) / (ti − t0), where ai = min(A, max(0, b − v(ti − t0)))
where b, v, A, and t0 are the parameters estimated using the standard LBA that correspond to the response threshold (b), the drift rate (v), the height of the distribution of starting points (A), and the non-decision time (t0), respectively. Note that the assumed independence between estimated parameters that is found in the standard LBA model is no longer preserved with the STLBA. Nevertheless, parameter recovery studies show that the STLBA can explain much of the variance in the true parameter values (see the text surrounding Figure 3 in Van Maanen et al., 2011).
As in the main LBA analysis, we computed single-trial estimates of drift rate based on a model where response threshold (b), drift rate (v), and the non-decision time (t0) were free to vary between SE and AE trials, whereas the height of the uniform distribution of starting points (A) was fixed (see Table 2 for exact values). Constraining the model in other reasonable ways (e.g., fixing the non-decision time parameter) yielded qualitatively similar results. Note also that the single-trial estimates for the starting point here are mathematically equivalent to single-trial estimates of the response threshold since what is actually being calculated is the relative distance between the two.
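These single-trial estimates can be computed in closed form: the constraint ai = b − di(ti − t0) ties the two parameters together, and the normal drift-rate density is maximized by choosing di as close to the mean drift v as the bound 0 ≤ ai ≤ A allows. A minimal sketch follows, under the assumption that the estimator reduces to this clipping rule; the parameter values are illustrative, not the fitted ones.

```python
import numpy as np

def stlba_single_trial(t, b, v, A, t0):
    """Single-trial MLE of drift rate d_i and start point a_i given RT t
    (a sketch of the STLBA estimator; see Van Maanen et al., 2011).

    The start point is pulled as close as possible to the value implied by
    the mean drift v, then clipped to the admissible range [0, A]; the
    single-trial drift rate follows from the threshold-crossing constraint.
    """
    a_hat = np.clip(b - v * (t - t0), 0.0, A)   # start point clipped to [0, A]
    d_hat = (b - a_hat) / (t - t0)              # implied single-trial drift
    return d_hat, a_hat

# Illustrative parameter values (not the paper's fits)
d, a = stlba_single_trial(t=0.8, b=1.2, v=1.0, A=0.5, t0=0.2)
```

A fast response (small t) forces a high start point and a high implied drift rate, which is why the single-trial drift and start-point estimates are no longer independent.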
All scanning was carried out on a General Electric MR750 3T scanner equipped with an 8-channel head coil at the W.M. Keck Center for Functional MRI on the main campus at UCSD. Anatomical images were acquired using a SPGR T1-weighted sequence that yielded images with a 1×1×1 mm resolution. Whole brain echo planar functional images (EPI) were acquired in 28 (for 8 of the subjects) or 26 (for the remaining subjects) oblique slices (TR = 1500 ms, TE = 30 ms, flip angle = 90°, image matrix = 64 × 64, field of view = 192 mm, slice thickness = 3 mm, 0 mm gap).
Data analysis was performed using BrainVoyager QX (v 1.91; Brain Innovation, Maastricht, The Netherlands) and custom timeseries analysis routines written in Matlab (The Math Works, Natick, Massachusetts). Data from the main experiment were collected in 8 or 10 runs per subject (i.e., either 4 or 5 runs per response instruction type, respectively). EPI images were slice-time corrected, motion-corrected (both within and between scans) and high-pass filtered (3 cycles/run) to remove low-frequency temporal components from the timeseries. The timeseries from each voxel in each observer was then z-transformed on a run-by-run basis to normalize the mean response intensity across time to zero. This normalization was done to correct for differences in mean signal intensity across voxels (e.g., differences related to a voxel’s composition or to its distance from the coil elements). We then estimated the magnitude of the BOLD response on each trial by shifting the timeseries from each voxel by four 1.5 second TRs (6 seconds) to account for the temporal lag in the hemodynamic response function, and then extracting data from the two consecutive 1.5 second TRs that correspond to the duration of each 3 second trial (see Kamitani and Tong, 2005; Serences and Boynton, 2007a,b; Serences et al., 2009). The two data points extracted from these two consecutive TRs were then averaged together to compute a single estimate of the response in each V1 voxel on each trial. These trial-by-trial estimates of the BOLD response amplitude were subsequently used as inputs to the forward encoding model (see Estimating feature-selective BOLD response profiles using a forward encoding model).
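The trial-wise extraction described above amounts to a per-voxel z-score, a four-TR shift, and a two-TR average. A sketch with simulated data follows; the array shapes, onset layout, and function name are hypothetical, not taken from the authors' code.

```python
import numpy as np

def trial_responses(timeseries, trial_onsets_tr, hrf_lag_tr=4, trs_per_trial=2):
    """Estimate single-trial responses from a (TRs x voxels) run timeseries.

    Mirrors the steps in the text: z-score each voxel over the run, shift by
    4 TRs (6 s) to account for hemodynamic lag, then average the two TRs
    spanning each 3 s trial. Onsets are given in TR units (hypothetical)."""
    z = (timeseries - timeseries.mean(axis=0)) / timeseries.std(axis=0)
    trials = []
    for onset in trial_onsets_tr:
        window = z[onset + hrf_lag_tr : onset + hrf_lag_tr + trs_per_trial]
        trials.append(window.mean(axis=0))       # one response per voxel
    return np.vstack(trials)                     # (trials x voxels)

rng = np.random.default_rng(1)
ts = rng.normal(size=(224, 50))                  # 336 s / 1.5 s TR = 224 TRs, 50 voxels
resp = trial_responses(ts, trial_onsets_tr=[0, 13, 26])
```

The resulting (trials × voxels) matrix is the form of input the forward encoding model expects.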
Each subject participated in two runs of an independent functional localizer scan to identify voxels within primary visual cortex that were responsive to the spatial position occupied by the oriented grating stimulus employed in the primary experiment. The localizer stimulus consisted of a full-contrast counter-phase modulated (8Hz) checkerboard that exactly matched the size of the oriented grating stimulus used in the main task. On each trial, the checkerboard stimulus was presented continuously for 10s, and the contrast of the checkerboard was reduced by 30% for a single video frame at 6 pseudo-randomly selected time-points. Subjects were instructed to make a button-press response with their right index finger every time they detected a contrast decrement. Each 10s trial was then followed by 10s of passive fixation. Visually responsive regions of primary visual cortex were identified using a general linear model (GLM) with a single regressor that was constructed by convolving a boxcar model of the stimulus sequence with a standard model of the hemodynamic response function (a difference-of-two gamma function model implemented in Brain Voyager, time to peak of positive response: 5s, time to peak of negative response: 15s, ratio of positive and negative responses: 6, positive and negative response dispersion: 1). Voxels were retained for analysis in the main experimental task if they passed a false discovery rate corrected single-voxel threshold of p<0.05.
A meridian mapping procedure consisting of a checkerboard wedge flickering at 8 Hz and subtending 60° of polar angle was used to identify V1 (Engel et al., 1994; Sereno et al., 1995). Subjects were instructed to fixate on the center of the screen and to passively view the peripheral stimulus. The data collected during these scans were then projected onto a computationally inflated representation of each subject’s gray/white matter boundary. V1 in each hemisphere was then manually defined according to the representations of the upper and lower vertical meridian following standard practices (Wandell et al., 2007).
The goal of encoding models is to adopt an a priori assumption about the important features that can be distinguished using hemodynamic signals within an ROI, and then to use this set of features (or basis functions) to predict observed patterns of BOLD responses (Brouwer and Heeger, 2009, 2011; Dumoulin and Wandell, 2008; Gourtzelidis et al., 2005; Kay and Gallant, 2009; Kay et al., 2008; Mitchell et al., 2008; Naselaris et al., 2009; Thirion et al., 2006; reviewed in Naselaris et al., 2011; Saproo and Serences, 2011). Here, we assumed that the BOLD response in a given V1 voxel represents the pooled activity across a large population of orientation selective neurons, and that the distribution of neural tuning preference is biased within a given voxel due to large-scale feature maps (Freeman et al., 2011) or to random anisotropies in the distribution of orientation-selective columns within each voxel (Kamitani and Tong, 2005; Swisher et al., 2010). Thus, the BOLD response measured from many of the voxels in V1 exhibits a robust orientation preference (Haynes and Rees, 2005; Kamitani and Tong, 2005; Serences et al., 2009; Brouwer and Heeger, 2011; Freeman et al., 2011).
To estimate orientation-selective response profiles based on activation patterns in V1, we first separated the data from the 8–10 scanning runs obtained for each subject into training and test sets using a ‘leave-two-out’ cross-validation scheme (i.e., all but one SE and one AE run were used as a training set, and the held-out SE and AE runs were used as a test set). By holding one AE and one SE run out for use as a test set, we ensured that the training set had an equal number of trials of each type. For each run in the training set, we then computed the mean response evoked by each of the 9 orientations, separately for each voxel. The mean responses were then sorted based on stimulus orientation and run (i.e. mean response to orientation 1 was first, then orientation 2, …, orientation 9). Thus, each training set had 54 observations for subjects who underwent 8 runs in the scanner (6 runs in training set × 9 orientations), and 72 observations for subjects who underwent 10 runs in the scanner (8 runs in the training set × 9 orientations). Note that, as described below, data in the test set were not averaged across trials, and a unique channel response function was estimated for every trial.
Adopting the terminology of Brouwer and Heeger (2009; 2011), let m be the number of voxels in a given visual area, n1 be the number of observations in the training set (either 54 or 72), n2 be the number of trials in the test set, and k be the number of hypothetical orientation channels. Let B1 (m × n1 matrix) be the training data set, and B2 (m × n2 matrix) be the test data set. Under the assumption that the observed BOLD signal is a weighted sum of underlying orientation selective neural responses, we generated a matrix of hypothetical channel outputs (C1, k × n1) comprised of nine half-sinusoidal functions raised to the 6th power as a basis set (see Figure 4). The training data in B1 were then mapped onto the matrix of channel outputs (C1) by the weight matrix (W, m × k) that was estimated using a GLM of the form:

B1 = W C1 (1)
where the ordinary least-squares estimate of W is computed as:

W = B1 C1^T (C1 C1^T)^-1 (2)
The channel responses C2 (k × n2) were then estimated based on the test data (B2) using the weights estimated in (2):

C2 = (W^T W)^-1 W^T B2 (3)
The first steps in this sequence (equations 1–2) are similar to a traditional univariate GLM in that each voxel is assigned a weight for each feature in the model (in this case, one weight for each hypothetical orientation channel). Equation 3 then implements a multivariate computation because the channel responses estimated on each trial (in C2) are constrained by the estimated weights assigned to each voxel and by the vector of responses observed across all voxels on that trial in the test set. Thus, one key feature of this approach is that a set of estimated channel responses can be obtained on a trial-by-trial basis so long as the number of voxels is greater than the number of channels. If there are fewer voxels than channels, then unique channel response estimates cannot be derived as the number of variables being estimated exceeds the number of available measurements. This ability to estimate the orientation-selective tuning profile on each trial is exploited when comparing channel responses on correct and incorrect trials and when correlating channel responses with accuracy and drift rates on a trial-by-trial basis (see Results)
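Equations (1)–(3) can be implemented directly in a few lines of numpy. The simulation below is illustrative: the voxel weights and noise level are invented, and pseudoinverses are used in place of the matrix inverses because nine 6th-power half-sinusoids span only seven dimensions, which makes C1 C1^T singular.

```python
import numpy as np

def channel_outputs(orients_deg, centers_deg, power=6):
    """Hypothetical channel outputs: half-sinusoids raised to the 6th power,
    180-degree periodic in orientation space (k x n)."""
    d = np.deg2rad(orients_deg[None, :] - centers_deg[:, None])
    return np.abs(np.cos(d)) ** power

def estimate_channel_responses(B1, C1, B2):
    """Forward model: (1) B1 = W C1; (2) W = B1 C1^T (C1 C1^T)^-1;
    (3) C2 = (W^T W)^-1 W^T B2. Pseudoinverses generalize the inverses
    when C1 is rank deficient."""
    W = B1 @ np.linalg.pinv(C1)                             # eq. (2), OLS weights
    return np.linalg.pinv(W) @ B2                           # eq. (3), test responses

rng = np.random.default_rng(2)
centers = np.arange(9) * 20.0                               # 9 channels, 20 deg apart
train_orients = np.tile(np.arange(9) * 20.0, 6)             # 6 training runs x 9 orientations
C1 = channel_outputs(train_orients, centers)                # 9 x 54 channel outputs
W_true = rng.normal(size=(50, 9))                           # 50 voxels with biased tuning
B1 = W_true @ C1 + 0.1 * rng.normal(size=(50, 54))          # simulated training data
B2 = W_true @ channel_outputs(np.array([60.0]), centers)    # one test trial at 60 deg
C2 = estimate_channel_responses(B1, C1, B2)                 # estimated channel responses
```

In this simulation the recovered channel response profile peaks at the channel centered on the test stimulus orientation, illustrating the trial-by-trial reconstruction described above.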
The shape of the basis functions used in C1 has a large impact on the resulting channel response estimates. In the present experiment, we used half-sinusoidal functions that were raised to the 6th power to approximate the shape of single-unit tuning functions in V1, where the 1/√2 half-bandwidth of orientation tuned cells is approximately 20° (although there is a considerable amount of variability in bandwidth, see Ringach et al., 2002a; Ringach et al., 2002b; Gur et al., 2005; Schiller, 1976). Given that the half-sinusoids were raised to the 6th power, a minimum of seven linearly independent functions was required to adequately cover orientation space (Freeman and Adelson, 1991); however, since we presented nine unique orientations in the experiment, we used a set of nine evenly distributed functions. The use of more than the required seven basis functions is not problematic so long as the number of functions does not exceed the number of measured stimulus values, in which case the matrix C1 would become rank deficient. While we selected the bandwidth of the basis functions based on physiology studies, all results that we report are robust to reasonable variations in this value (i.e., raising the half-sinusoids to the 5th–8th power, all of which are reasonable choices based on the documented variability of single-unit bandwidths). Note that since the magnitude of the channel responses is scaled by the amplitude of the basis functions (which was set to 1 here), the units along the y-axes of all data plots are in arbitrary units. Importantly, however, scaling the basis functions to some other common value would not change the differential response between conditions.
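Two of the claims above (the roughly 20° half-bandwidth and the need for at least seven linearly independent functions) are easy to verify numerically for the 6th-power half-sinusoid basis. The orientation grid and channel spacing below are our own illustrative choices.

```python
import numpy as np

orients = np.arange(0.0, 180.0)                    # 1 deg grid over orientation space
centers = np.arange(9) * 20.0                      # nine evenly spaced channels

def basis(power):
    """Half-sinusoid channels raised to the given power (9 x 180)."""
    d = np.deg2rad(orients[None, :] - centers[:, None])
    return np.abs(np.cos(d)) ** power

b6 = basis(6)
# 1/sqrt(2) half-bandwidth of a 6th-power half-sinusoid: solve cos(x)^6 = 2**-0.5
half_bw = np.degrees(np.arccos(2.0 ** (-1.0 / 12.0)))
# cos^6 contains only the frequencies 0, 2, 4 and 6 of its argument, so nine
# shifted copies span a 7-dimensional subspace: at least seven independent
# functions are needed to cover orientation space (Freeman and Adelson, 1991)
rank6 = np.linalg.matrix_rank(b6)
```

Here half_bw comes out at roughly 19.3°, close to the single-unit bandwidth cited above, and rank6 equals 7 even though nine functions are used.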
Using this modeling approach, the center position of each function in the basis set can be systematically shifted across orientation space to estimate the response in a channel centered at any arbitrary orientation (as long as the channels remain linearly independent; Freeman and Adelson, 1991). While this method of shifting the center of each channel across orientation space could in principle be used to generate channel response profiles with a resolution of 1° (or even smaller), we opted to reconstruct the response functions in 5° steps as no additional insights were gained by estimating the responses at a higher resolution. After generating a channel response function on each trial in 5° steps across orientation space, each function was circularly shifted to a common stimulus-centered reference frame, and these re-centered response functions were averaged across left and right V1 and across all trials of a like kind. Thus, by convention the 0° point along the x-axis in all plots refers to the stimulus that evoked the response profile. Finally, since all channel response functions were found to be symmetrical about their center point, we averaged data from corresponding offsets on either side of the 0° point (i.e., data were averaged from the channels offset by +5° and −5° from the stimulus, +10° and −10°, and so forth) to produce the reported orientation tuning functions. Note that in the process of collapsing across channels centered on both positive and negative offsets from 0°, we necessarily collapsed across mismatch trials in which there was either a clockwise or a counterclockwise offset between sequentially presented gratings within a trial. 
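The re-centering and folding steps can be sketched as follows. The response function here is synthetic; in the actual analysis the inputs would be the 36-point channel response functions estimated on each trial.

```python
import numpy as np

def recenter_and_fold(resp, stim_idx):
    """Circularly shift a channel response function so the channel centered
    on the stimulus sits at index 0, then average corresponding +/- offsets.

    resp: 1-D array of channel responses sampled evenly around orientation
    space; stim_idx: index of the channel centered on the stimulus."""
    centered = np.roll(resp, -stim_idx)              # stimulus channel -> index 0
    n = len(centered)
    folded = [centered[0]]
    for k in range(1, n // 2 + 1):
        folded.append(0.5 * (centered[k] + centered[-k]))  # +k and -k offsets
    return np.array(folded)

# Hypothetical 36-point response function (5 deg steps) peaking at 60 deg
resp = np.abs(np.cos(np.deg2rad(np.arange(36) * 5.0 - 60.0))) ** 6
tuned = recenter_and_fold(resp, stim_idx=12)         # stimulus at 60 deg
```

The folded function runs from the 0° point (the stimulus-centered channel) out to the 90° offset, matching the stimulus-centered tuning functions reported in the figures.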
However, sorting the data by the rotational offset of the deviant grating had no qualitative impact on any of our results, presumably because the two gratings were flickering back and forth on sequential presentations over the course of the 3s trial (see Figure 1) and because there was a random jitter of up to ±6° introduced on each trial (see task description above), which was on the same order as the offset between sequential gratings on mismatch trials (±5°).
Because the basis functions used to estimate channel responses overlapped – thus violating the independence assumption of traditional statistical tests – we estimated statistical significance using a non-parametric bootstrapping/randomization procedure. Note that this bootstrapping/randomization procedure was used for all comparisons related to BOLD tuning functions (see Figures 6 and 7; AE v. SE, correct AE v. incorrect AE, correct SE v. incorrect SE, the interaction between AE v. SE based on accuracy, AE logistic regression beta weights v. SE logistic regression beta weights, and single-trial correlations between AE responses and drift rate v. single-trial correlations between SE responses and drift rate). First, a series of standard paired t-tests was performed to determine which points along the two tuning curves differed significantly (using a threshold of p<0.05 for each individual t-test). We then generated a new data set by randomly selecting 14 participants with replacement and then re-assigning the condition label associated with data from each participant with a probability of 0.5. A series of paired t-tests was performed on the re-sampled and randomized data set using the same procedure applied to the observed data. This re-sampling plus randomization procedure was then iterated 10,000 times to determine the probability of obtaining the pattern of significant differences obtained using the intact data set under the null hypothesis that the two conditions are equivalent (i.e., interchangeable). The reported p-values in the main text thus reflect the proportion of times we observed a pattern of significant t-tests in the re-sampled data that matched the pattern obtained in the observed data. Note that the behavioral data were evaluated using conventional parametric statistical techniques.
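A minimal numpy sketch of this resampling-plus-randomization test follows. The data are synthetic, and the critical t value for df = 13 is hard-coded rather than computed from a t distribution.

```python
import numpy as np

T_CRIT = 2.160   # two-tailed p < .05 critical t for df = 13 (14 subjects)

def sig_pattern(a, b):
    """Signed pattern of pointwise paired t-tests between two
    (subjects x points) arrays: +1/-1 where significant, else 0."""
    d = a - b
    t = d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(d.shape[0]))
    return np.sign(t) * (np.abs(t) > T_CRIT)

def bootstrap_randomization_p(cond_a, cond_b, n_iter=10000, seed=0):
    """Proportion of data sets, resampled over subjects with replacement and
    with condition labels flipped with probability .5, that reproduce the
    observed pattern of significant pointwise differences."""
    rng = np.random.default_rng(seed)
    n_sub = cond_a.shape[0]
    observed = sig_pattern(cond_a, cond_b)
    hits = 0
    for _ in range(n_iter):
        idx = rng.integers(0, n_sub, n_sub)        # resample subjects
        flip = rng.random(n_sub) < 0.5             # exchange labels under H0
        a, b = cond_a[idx].copy(), cond_b[idx].copy()
        a[flip], b[flip] = b[flip], a[flip]
        hits += np.array_equal(sig_pattern(a, b), observed)
    return hits / n_iter

# Synthetic example: 14 subjects, 10 channels, a real difference at channels 0-2
rng = np.random.default_rng(1)
cond_b = rng.normal(size=(14, 10))
cond_a = rng.normal(size=(14, 10))
cond_a[:, :3] += 2.0
p = bootstrap_randomization_p(cond_a, cond_b, n_iter=2000, seed=2)
```

Because the entire pattern of significant points must be reproduced by a resampled, label-shuffled data set, a multi-point difference that is genuine yields a very small p.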
Trials on which RTs were faster than 200 milliseconds were discarded in all subsequent analyses (including the model fitting procedures; see Materials and Methods). Two-way repeated-measures analysis of variance (ANOVA) with factors for response-emphasis (speed vs. accuracy emphasis, or SE and AE trials, respectively) and trial type (match vs. mismatch) was used to assess accuracy and RT data collected during the scanning session (see Table 1 for a summary of the group data). The task instructions produced a strong SAT effect: participants responded faster on SE trials compared to AE trials (F(1,13)=39.168, p<0.001, Table 1), and there was a corresponding drop in accuracy on SE trials (F(1,13)=71.975, p<0.001, Table 1).
On average, subjects gave a “match” response 55% of the time, which is significantly greater than chance (one-sample t-test, t(13)=2.49, p=0.03). In addition, RTs were slower and accuracy slightly better on match trials compared with mismatch trials (F(1,13)=13.26, p<0.01; F(1,13)=5.4, p<0.05, Table 1), which is consistent with the bias to respond “match” over “mismatch” and commensurate with the well-known propensity for making confirmatory responses (Clark and Chase, 1972). There was an interaction between response-emphasis and trial type for RTs (F(1,13)=12.6, p<0.01, Table 1), with selectively long RTs on match AE trials. However, there was no interaction between response-emphasis and trial type for accuracy rates (F(1,13)=0.61, p=0.45, Table 1).
Accuracy rates and RTs might be lower on SE trials compared to AE trials due to differences in response caution and/or in the rate at which sensory evidence is accumulated. Therefore, we used a mathematical model of decision making (see Figure 3) to investigate how emphasizing either speed or accuracy influenced the rate of sensory evidence accumulation (as captured by the drift rate parameter) and response caution (as captured by the distance between the starting point and the response threshold, see Methods). Given that the neuronal mechanisms thought to support fine orientation discriminations are reasonably well-characterized (see Figure 2 and Predictions section below), we focused our analyses on mismatch trials (data from match and mismatch trials were nonetheless fit simultaneously, see LBA model analyses under Materials and Methods for more details). Eight different versions of the LBA model of Brown and Heathcote (2008) were fit by allowing all combinations of three different parameters (drift rate, response caution, and non-decision time) to either vary freely across SE and AE conditions or to be fixed across those conditions, while keeping the maximum starting point always fixed across AE and SE conditions (see Figure 3 and Linear Ballistic Accumulator model and LBA model analyses under Materials and Methods for more details). We then used the Bayesian Information Criterion (BIC) to select the most parsimonious of the eight models, which is a commonly used measure that evaluates the trade-off between model complexity and goodness of fit (Schwarz, 1978; Raftery, 1995). The model yielding the best BIC was the one that estimated different values for the parameters corresponding to response caution, drift rate, and non-decision time on AE trials compared to SE trials (see Table 2 for a summary of the parameter fits to data averaged over all the subjects). 
Based on approximated posterior model probabilities (see LBA model analyses under Materials and Methods for more details), this design was found to be more than 10¹⁰ times more likely than the next best design. Individually, this design was also the modal result: the BIC values for 6 out of 14 subjects preferred this design. Four subjects had best-BIC designs that included an effect on drift rate but not response threshold, while three had best-BIC designs that included an effect on threshold but not drift rate. Only one subject had a best-BIC design that included no effect at all of the experimental manipulation.
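For readers unfamiliar with BIC-based model comparison, the quantities involved can be sketched as follows (a generic illustration, not the study's fitting code; equal prior probabilities across model designs are assumed):

```python
import numpy as np

def bic(log_likelihood, n_params, n_obs):
    """Schwarz (1978) Bayesian Information Criterion: lower is better.
    Penalizes model complexity (n_params) relative to goodness of fit."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

def posterior_model_probs(bic_values):
    """Approximate posterior model probabilities from a set of BIC values
    (Raftery, 1995), assuming equal prior probability for each design."""
    bic_values = np.asarray(bic_values, dtype=float)
    delta = bic_values - bic_values.min()   # rescale for numerical stability
    w = np.exp(-0.5 * delta)
    return w / w.sum()
```

Under this approximation, the ratio of posterior probabilities between two designs is exp(−ΔBIC/2), so a BIC advantage of 2·ln(10¹⁰) ≈ 46 corresponds to one design being roughly 10¹⁰ times more likely than another.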
Figure 5 shows the fit of this best-BIC model to the cumulative response time distributions from the data. The figure summarizes each distribution using quantiles plotted against response probability. Such plots are also known as “defective cumulative distribution plots” and are a standard method for evaluating the quality of fit for response time models, as they provide a much more rigorous test than histograms (for introductions to this method and related discussion, see Ratcliff and Tuerlinckx, 2002 or Donkin et al., 2011a). The model fits the data quite well, accurately matching the probability (indicated by the height on the graph) of each response type in each condition. The latency of each part of the response time distribution (abscissa) is also accurately captured by the model. For example, in the SE and AE conditions, the median observed correct RT differs from the median LBA-predicted value by less than 25 milliseconds.
In any choice task, it is possible that participants occasionally make random guesses that are independent of the available stimulus information. This is a particular concern in the SE condition, where error rates were relatively high. However, because the decision model fit the response time distributions from both conditions very well (see Figure 5, left panel), simple random guessing is unlikely to explain the observed differences in parameters between the SE and AE conditions. Nevertheless, to avoid having the model results overly biased by contaminant processes such as guessing, we incorporated a mixture process with the assumption that each response had a 98% probability of arising from the LBA choice process, and a 2% probability of arising from a guessing process with random responses and uniform RT over the observed range (see Ratcliff and Tuerlinckx, 2002 and Donkin et al., 2011a, for details). With this built-in assumption, the decision model still fit the response time distributions from both conditions very well (Figure 5), consistent with the hypothesis that participants were making informed decisions on the vast majority of trials.
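The contaminant mixture amounts to a simple reweighting of the model's predicted RT density. A minimal sketch (with a toy density standing in for the actual LBA defective density, and with guesses split equally across the two responses – a simplifying assumption made here for illustration):

```python
import numpy as np

def mixture_density(rt_density, rt, rt_min, rt_max, p_guess=0.02):
    """Defective RT density under a contaminant mixture: with probability
    1 - p_guess the response arises from the choice model (rt_density),
    and with probability p_guess from a random guess whose RT is uniform
    over the observed range [rt_min, rt_max]. The guess density is split
    equally across the two response alternatives (simplifying assumption)."""
    uniform = 1.0 / (rt_max - rt_min)   # uniform RT density for guesses
    return (1.0 - p_guess) * rt_density(rt) + p_guess * 0.5 * uniform
```

Because p_guess is small, the mixture leaves the fit to the bulk of the RT distribution essentially unchanged while preventing a handful of fast or slow outliers from distorting the parameter estimates.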
Consistent with most SAT studies, we observed a difference in response caution (Table 2; see Ratcliff, 1985; Ratcliff and Rouder, 1998; Voss et al., 2004; Palmer et al., 2005). Moreover, the difference in the rate of evidence accumulation between correct and incorrect accumulators was larger on AE trials than on SE trials (F(1,13)=18.27, p<0.005, repeated-measures two-way ANOVA, with neither a main effect of stimulus type nor an interaction between response type and stimulus type, F(1,13)=2.82 and F(1,13)=0.95, respectively, p>0.10 for both). In the LBA, high accuracy occurs when the accumulator corresponding to the correct response for the stimulus gathers evidence more quickly than the accumulator corresponding to the incorrect response. The larger difference in drift rates between correct and incorrect accumulators on AE trials therefore suggests that sensory information about the correct response is selectively accumulated at a higher rate when subjects make decisions in the absence of speed pressure. Such selectivity represents a departure from the typical assumption employed by mathematical psychologists that the rate of sensory evidence accumulation is fixed across AE and SE conditions (for extensive discussion, see Ratcliff and Rouder, 1998), as well as from the typical assumption that response caution is the only cognitive process involved in the SAT. However, others have also observed evidence for a change in drift rates with task demands (Vandekerckhove et al., 2011), and we speculate that the effect may be even more apparent in the present task because subjects were engaged in a difficult perceptual discrimination in which the quality of sensory representations critically determined behavioral performance (see also Hübner et al., 2010 for a more theoretical treatment).
Finally, the small observed differences in the time taken for non-decision processing between SE and AE conditions (see Table 2) are sometimes observed as a consequence of task instructions, but these differences are not usually of interest when the main purpose of the manipulation is to influence decision processing (see Starns and Ratcliff, 2010; Voss et al., 2004).
In general, the parameter estimates from the LBA model have been shown to be in agreement with the corresponding parameters in the Ratcliff diffusion drift model (Donkin et al., 2011b). Nevertheless, in order to demonstrate that our modeling results are not specific to our choice of model and fitting procedures, we also fit our behavioral data using the Diffusion Model Analysis Toolbox (DMAT) implemented in MATLAB (Vandekerckhove and Tuerlinckx, 2007; 2008). We used DMAT to fit the same eight models tested in our LBA analysis (i.e., all possible combinations of drift rate, response threshold, and non-decision time varying or staying fixed across AE and SE conditions while keeping all other parameters fixed). Group BIC values for each model design were calculated in the same manner as those computed for the LBA models (see LBA model analyses). Consistent with the results of the LBA model, the diffusion model design with the best BIC was the one that estimated different values for the parameters corresponding to drift rate and response threshold on AE versus SE trials (AE drift rates were larger than SE drift rates t(13)= 4.93, p<0.01, and AE response thresholds were higher than SE response thresholds t(13)=5.94, p<0.01). We then approximated posterior model probabilities using the group BIC value across all subjects for each model design. The model where only drift rate and response threshold varied yielded the greatest posterior probability (close to 1 while all other posterior probabilities were close to 0).
We next sought to establish a relationship between feature-selective BOLD responses in early visual areas and behavior. In situations that require discriminating between two highly similar stimuli (as in the present experiment where orientations on mismatch trials were offset by only 5°), neurons tuned to off-target orientations provide the most information about the presence of mismatching orientations (see Figure 2; Hol and Treue, 2001; Butts and Goldman, 2006; Jazayeri and Movshon, 2006; Purushothaman and Bradley, 2005; Navalpakkam and Itti, 2007; Regan and Beverley, 1985; Schoups et al., 2001; Scolari and Serences, 2009; 2010). Hence, we focused our analyses on mismatch trials in which the activation of off-target neurons is predicted to support such decisions. Given the relatively large difference in drift rates associated with correct and incorrect accumulators on AE mismatch trials (see Table 2), we predicted that correct mismatch AE trials should be associated with more activation in off-target neural populations compared to incorrect mismatch AE trials. The difference between the drift rates associated with the correct and incorrect accumulators on SE mismatch trials, on the other hand, is much smaller (see Table 2). We would therefore expect a small difference between off-target activation on correct SE mismatch trials compared to incorrect mismatch SE trials.
We used fMRI and a forward encoding model of BOLD responses (see Brouwer and Heeger, 2009, 2011; reviewed by: Naselaris et al., 2011; Serences and Saproo, 2011) to estimate how the SAT modulates orientation selective response profiles in V1 (see section entitled Estimating feature-selective BOLD response profiles using a forward encoding model under Materials and Methods). On mismatch trials, we first compared the BOLD-based orientation tuning functions (TFs) associated with AE trials with those associated with SE trials (Figure 6a) and found no significant difference in the shape of the curves (p=0.91; this and all subsequently reported p-values associated with TFs were estimated using a non-parametric randomization procedure due to the non-independence of adjacent data points, see Materials and Methods). However, when examining only the AE mismatch trials, we found a significant interaction between channel offset and behavioral accuracy (p<0.01, Figure 6b). In particular, responses in channels tuned approximately 25°–65° away from the target-orientation showed larger responses on correct trials compared to incorrect trials. The observation of more activation in these off-target channels on correct trials is consistent with our predictions, as these neural populations should better signal small changes in orientation. In turn, more gain in off-target populations should increase the quality of the information being sent to downstream decision mechanisms and thus increase the probability of a correct response (see Figure 2c). In contrast, no differences were observed between channel responses associated with correct and incorrect SE trials (p=0.90, Figure 6c), and the difference between off-target channel responses on correct and incorrect AE trials was significantly larger than the difference on correct and incorrect SE trials (p<0.01, Figure 6d). 
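The forward encoding approach of Brouwer and Heeger (2009) can be summarized in two least-squares steps: estimate a voxels × channels weight matrix from training data, then invert it to recover channel responses on held-out data. A minimal sketch (hypothetical matrix shapes; not the analysis code used in the study):

```python
import numpy as np

def train_encoding_model(B_train, C_train):
    """Estimate voxel-wise channel weights W (voxels x channels) by
    ordinary least squares from training BOLD data B_train
    (voxels x trials) and modeled channel responses C_train
    (channels x trials), under the linear model B = W C."""
    # lstsq solves C^T W^T = B^T for W^T, so transpose back at the end
    W = np.linalg.lstsq(C_train.T, B_train.T, rcond=None)[0].T
    return W

def invert_encoding_model(W, B_test):
    """Recover estimated channel responses (channels x trials) from
    test data by inverting the trained model: C_hat solves W C = B."""
    return np.linalg.lstsq(W, B_test, rcond=None)[0]
```

In practice the training and test data come from independent scanning runs, so that the inversion used to derive the BOLD-based tuning functions is not circular.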
This interaction is consistent with the relatively large difference in drift rates associated with correct versus incorrect accumulators on AE trials compared with SE trials (see Table 2).
As shown in Figure 6b, we observed more activation in off-target channels on correct trials compared to incorrect trials in the AE condition. To further test the relationship between the magnitude of off-target responses and behavior, we performed a between-subject correlation between the change in drift rate and the change in off-target activation on correct and incorrect AE trials (where our measure of off-channel activation was the area between the TFs associated with correct and incorrect responses across channels tuned 25°–65° from the target orientation, see Figure 6b). Across subjects, larger differences between correct and incorrect accumulator drift rates were positively correlated with larger differences in off-target activation on correct compared to incorrect AE trials (Figure 7a, R2=0.36, t(12)=2.59, p<0.025). This relationship was still observed even when the total area between the TFs associated with correct and incorrect AE trials (i.e., from 0° to 90°) was correlated with the differential drift rates (R2=0.30, t(12)=2.24, p<0.05), demonstrating that the positive correlation did not strongly depend on the exact points along the TFs that were entered into the analysis. This between-subjects relationship between BOLD and behavior suggests that individual differences in the degree of off-target activation in V1 – and by inference, individual differences in the amount of information encoded about the orientation offset of mismatched stimuli – predicts the speed of evidence accumulation during decision making when subjects are not under speed pressure.
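The off-channel activation metric and the between-subject brain-behavior correlation can be sketched as follows (hypothetical inputs; the 25°–65° window follows the range reported above):

```python
import numpy as np
from scipy import stats

def off_target_area(tf_correct, tf_incorrect, offsets, lo=25, hi=65):
    """Area between correct- and incorrect-trial tuning functions over
    channels tuned lo to hi degrees from the target orientation
    (trapezoidal integration over the channel-offset axis)."""
    mask = (offsets >= lo) & (offsets <= hi)
    d = np.asarray(tf_correct)[mask] - np.asarray(tf_incorrect)[mask]
    x = np.asarray(offsets)[mask]
    return float(np.sum((d[:-1] + d[1:]) * np.diff(x)) / 2.0)

def brain_behavior_correlation(areas, drift_diffs):
    """Between-subject Pearson correlation between off-target activation
    differences and correct-minus-incorrect drift-rate differences."""
    r, p = stats.pearsonr(areas, drift_diffs)
    return r, r**2, p
```

Recomputing the area over the full 0°–90° range, as in the control analysis above, requires only changing the lo/hi bounds.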
The correlation analysis presented in Figure 7a establishes a subject-by-subject relationship between off-target responses in V1 and the rate of sensory evidence accumulation. To further explore this relationship on a within-subject basis, we next used logistic regression to map fluctuations in the magnitude of the response in each orientation channel to accuracy on a trial-by-trial basis. A positive fit coefficient (beta coefficient) indicates that higher activation in a given channel predicts a higher probability of a correct behavioral response; negative beta coefficients indicate an inverse relationship between BOLD activation levels and behavioral performance. On AE mismatch trials, larger responses in channels tuned to the target (0° offset) were associated with a higher probability of incorrect responses, whereas larger responses in channels tuned approximately 40°–60° from the target were associated with a higher probability of a correct response (Figure 7b). In contrast, the beta coefficients on SE trials fluctuated around zero. This pattern gave rise to a significant cross-over interaction between the AE and SE beta coefficient curves (p=0.021, Figure 7b). As with the increased off-target activation on correct AE trials (Figure 6b) and the corresponding relationship with the rate of sensory evidence accumulation on a between-subject basis (Figure 7a), this trial-by-trial coupling between the magnitude of off-target channel responses and behavioral performance suggests that perceptual decisions are tightly coupled to activation levels across informative off-target sensory neurons, but only when subjects emphasize accuracy over speed.
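The trial-by-trial mapping from channel responses to accuracy amounts to a logistic regression yielding one beta weight per channel. A self-contained sketch (using simple gradient ascent rather than the study's actual fitting routine; array shapes are hypothetical):

```python
import numpy as np

def channel_betas(channel_responses, correct, n_iter=500, lr=1.0):
    """Fit a logistic regression predicting trial accuracy (0/1) from
    single-trial channel responses (trials x channels) by gradient
    ascent on the mean log-likelihood; returns one beta weight per
    channel (intercept discarded). A positive beta means higher
    activation in that channel predicts a correct response."""
    X = (channel_responses - channel_responses.mean(0)) / channel_responses.std(0)
    X = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    y = np.asarray(correct, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))        # predicted P(correct)
        beta += lr * X.T @ (y - p) / len(y)        # log-likelihood gradient
    return beta[1:]
```

Standardizing the channel responses before fitting puts the beta weights on a common scale, so their relative magnitudes across channels can be compared directly.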
Given the data presented thus far that off-target activation levels in V1 predict behavior on AE trials, we would also predict a positive correlation between trial-by-trial estimates of the rate of sensory evidence accumulation and the magnitude of the BOLD response in off-target channels. To evaluate this relationship, we correlated trial-by-trial estimates of off-target channel responses and trial-by-trial estimates of drift rates derived from the STLBA model on a within-subject basis.
As in the standard LBA model described above, we let the rate of sensory evidence accumulation (drift rate), response caution, and non-decision time vary freely across AE and SE conditions (and as in the standard LBA model, the height of the starting point distribution was fixed). Next, we estimated both channel responses and single-trial drift rates on each correct mismatch trial and then computed a correlation between these metrics across all trials for each subject. We observed larger correlation coefficients on AE compared to SE trials in channels tuned 30°–55° away from the target. This pattern produced a significant crossover interaction between task instruction and correlation coefficient (p=0.01, Figure 7c), and suggests that larger off-target responses selectively predict higher rates of evidence accumulation on AE trials. This finding converges with the prior analyses of channel response amplitude (Figure 6b,d), between-subject correlation (Figure 7a), and within-subject logistic regression (Figure 7b), and is consistent with the idea that responses in informative sensory neurons are strongly coupled with behavioral performance, but only in the absence of speed pressure. However, this analysis more directly links trial-by-trial fluctuations in off-target channel responses with the rate of sensory evidence accumulation during decision making.
Note that the correlations shown in Figure 7c were expected to be small because both measures (model parameters and BOLD responses) are extremely variable when estimated on a trial-by-trial basis. Nevertheless, even though they were small in magnitude, the observed correlations were robust to reasonable changes in assumptions about how model parameters were constrained across conditions. For example, the same general pattern was observed using a variant of the STLBA model where the non-decision time was fixed across trials. These results were also specific to trial-by-trial estimates of the model parameter corresponding to the rate of sensory evidence accumulation: correlating V1 channel responses to raw response times or to trial-by-trial estimates of the parameter corresponding to response caution did not yield robust correlations. The selectivity of the correlations presented in Figure 7c thus illustrates the explanatory power of the rate of sensory evidence accumulation on the SAT data and further supports the relationship between optimal response patterns in V1 and decision making when subjects emphasize accuracy over speed.
When subjects emphasized accuracy, higher off-target activation levels predicted larger differential rates of sensory evidence accumulation (Figure 7a), logistic regression revealed a trial-by-trial relationship between behavioral accuracy and BOLD activation levels in off-target orientation channels (Figure 7b), and a model that provides trial-by-trial estimates of the latent cognitive processes involved in perceptual decision making (Van Maanen et al., 2011) revealed a correlation between activation levels in off-target channels and the rate of sensory evidence accumulation (Figure 7c).
The observation that off-target activation levels consistently predict behavioral performance on AE trials suggests that decision mechanisms can selectively pool inputs from the most informative sensory neurons (Purushothaman and Bradley, 2005; Law and Gold, 2009). However, this reliance on informative off-target channels during decision making only appears to happen on AE trials, as fluctuations in off-target responses do not predict behavior under speed pressure. This observation leads to an interesting prediction: given the low overall accuracy under speed pressure, we might have expected that off-target activation levels on SE trials more closely match off-target activation levels on incorrect AE trials (compare Figures 6a and 6b). Contrary to this prediction, we instead observed that tuning functions in the SE condition more closely resemble tuning functions on correct AE trials. This suggests that poor performance on SE trials is not related to low overall signal in off-target channels per se, but instead is caused by a failure to rely on informative populations of sensory neurons in an optimal manner during decision making. Although further investigation is clearly warranted, this apparent failure to rely on informative off-target neural responses on speeded trials may reflect a heuristic that enables a quick but imprecise readout of sensory information when response speed is at a premium.
One interpretation of the relationship between behavior and off-target modulations on AE trials holds that top-down attentional signals originating in frontal and parietal cortex differentially bias activation levels in off-target channels on a trial-by-trial basis. This type of attentional-feedback account is consistent with many theories of attentional control (reviewed in: Corbetta and Shulman, 2002; Desimone and Duncan, 1995; Kastner and Ungerleider, 2000; Noudoost et al., 2010; Serences and Yantis, 2006; Yantis, 2008) as well as recent evidence that the frontal operculum plays a causal role in governing attentional modulations in visual cortex and concomitant changes in performance across observers (Higo et al., 2010), and that sub-regions of frontal cortex mediate perceptual decisions (Purcell et al., 2010; Gold and Shadlen, 2007; Heekeren et al., 2004; de Lafuente and Romo, 2005; 2006; Lemus et al., 2010; Hernández et al., 2010; Ho et al., 2009; Kayser et al., 2010). However, since we did not directly manipulate attention in this study, it is difficult to dissociate sources of variability in V1 that are due to fluctuations in top-down biasing signals from sources of variability that are local to visual cortex. Future studies could more critically examine this issue by pairing a SAT task with either a valid or a neutral attention cue to determine if speed pressure selectively impairs subjects’ ability to use prior information to appropriately bias population response profiles in visual cortex.
In addition to suboptimal usage of sensory information during decision making, it is likely that performance under speed pressure in our task is further limited by other neural mechanisms that operate outside of primary visual cortex. Several studies have found increased activation in the striatum when speeded responses are emphasized (Van Veen et al., 2008; Forstmann et al., 2008; Forstmann et al., 2010), consistent with a response threshold account in which only motor and frontal areas are involved in mediating the SAT (Van Veen et al., 2008; Ivanoff et al., 2008; Forstmann et al., 2008, 2010; Wenzlaff et al., 2008; Ratcliff, 1985; Ratcliff and Rouder, 1998). In contrast, our findings provide support for the sensory-readout account, which posits that perceptual performance under speed pressure is also limited by how efficiently sensory information is integrated during decision making.
The forward encoding model that we used to estimate responses in different orientation channels provides only a proxy for the actual neural activity in the underlying populations of sensory neurons. This leads to an inevitable loss of resolution, as a single V1 voxel contains many neural populations and the bandwidth of V1 neurons can be highly variable. Therefore, it is difficult to pinpoint the exact orientation offset at which off-target BOLD modulations would peak given a perfectly optimal modulation of underlying neural responses. However, the observation of increased responses starting in channels tuned 25°–30° from the target is reasonable given the known tuning function properties of cells in V1 (see Ringach et al., 2002a; Ringach et al., 2002b; Gur et al., 2005; Schiller, 1976). More generally, the robust relationship between off-channel activation levels and behavior supports the functional importance of the observed modulations, and is consistent with established models of optimal gain during fine discriminations (Figure 2).
Generating channel tuning functions also depends critically on the ability of fMRI to reliably measure orientation-selective responses in primary visual cortex. In V1, it is likely that these feature-selective response biases depend to a large degree on relatively coarse maps of orientation space that unfold across the cortical surface (Freeman et al., 2011; Mannion et al., 2010; Leventhal, 1983; Sasaki et al., 2006; Schall et al., 1986; Zhang et al., 2011). For instance, there is a radial orientation bias in V1 (Freeman et al., 2011; Sasaki et al., 2006; Zhang et al., 2011). Thus, neurons with spatial receptive fields in (say) the upper right visual field tend to respond more to oblique orientations around 45°, and so on. Given the robust retinotopic organization of V1, this radial bias would generate an orderly representation of orientation across patches of cortex that represent each visual quadrant (Freeman et al., 2011; Sasaki et al., 2006; Zhang et al., 2011). In addition to this coarse orientation map across V1, voxel-level orientation selectivity may also reflect contributions from random anisotropies in the distribution of orientation selective columns within a voxel (Kamitani and Tong, 2005; Haynes and Rees, 2005; Swisher, et al., 2010; see Boynton, 2005 for a useful graphical illustration). Thus, there is growing evidence that the combination of BOLD fMRI and encoding models can be used to index feature-selective responses arising from neural signals at both coarse and fine spatial scales.
Despite this link, we do not claim that orientation selective response functions are solely related to neural spiking activity, as the BOLD signal is modulated by many sources including synaptic input from both local and distant inputs, tuned local field potentials, and even responses in astrocytes (Heeger et al., 2000; Heeger and Ress, 2002; Logothetis et al., 2001; Buxton, 2002; Logothetis and Wandell, 2004; Sirotin and Das 2009; Das and Sirotin, 2011; Handwerker and Bandettini, 2011a, 2011b; Jia et al., 2011; Kleinschmidt and Muller, 2010; Schummers et al., 2008). However, given that neurons in early sensory areas like V1 are massively interconnected (e.g., Douglas and Martin, 2007), changes in the BOLD signal related to synaptic activity should be highly correlated with changes in local spiking activity. Despite these caveats, the robust predictive relationship between off-target channel modulations and behavior strongly supports the functional significance of these indirect BOLD assays of neuronal activation.
The instruction dependent change in the reliance of decision mechanisms on off-target channels in V1 is consistent with other recent studies of perceptual decision making. For instance, Kahnt and colleagues (2011) found that training-related improvements in performance on a difficult perceptual discrimination task could be explained by a model in which sensory information is read out more effectively, thereby improving the representations of the decision variables leading up to the ultimate choice (see also: Law and Gold, 2008; 2009; Purushothaman and Bradley, 2005; Pestilli et al., 2011). Similarly, Rahnev et al. (2011) observed that manipulating prior expectation increased functional connectivity between posterior and frontal areas, consistent with an increase in the rate of sensory evidence transfer from earlier visual areas to putative decision mechanisms. Thus, the present results complement other recent studies that emphasize the importance of efficient sensory readout in perceptual decision making, and suggest that the optimality of readout breaks down under speed pressure.
We would like to thank Thomas Sprague for useful discussions. This work was supported by the National Institute of Mental Health (R01-MH092345 to J.T.S.).