Cereb Cortex. 2016 April; 26(4): 1409–1420.
Published online 2014 October 19. doi:  10.1093/cercor/bhu236
PMCID: PMC4785939

The Role of Corticostriatal Systems in Speech Category Learning


One of the most difficult category learning problems for humans is learning nonnative speech categories. While feedback-based category training can enhance speech learning, the mechanisms underlying these benefits are unclear. In this functional magnetic resonance imaging study, we investigated neural and computational mechanisms underlying feedback-dependent speech category learning in adults. Positive feedback activated a large corticostriatal network including the dorsolateral prefrontal cortex, inferior parietal lobule, middle temporal gyrus, caudate, putamen, and the ventral striatum. Successful learning was contingent upon the activity of domain-general category learning systems: the fast-learning reflective system, involving the dorsolateral prefrontal cortex that develops and tests explicit rules based on the feedback content, and the slow-learning reflexive system, involving the putamen in which the stimuli are implicitly associated with category responses based on the reward value in feedback. Computational modeling of response strategies revealed significant use of reflective strategies early in training and greater use of reflexive strategies later in training. Reflexive strategy use was associated with increased activation in the putamen. Our results demonstrate a critical role for the reflexive corticostriatal learning system as a function of response strategy and proficiency during speech category learning.

Keywords: category learning, fMRI, corticostriatal systems, speech, putamen


What neural mechanisms underlie language acquisition in adulthood? Learning speech sounds of a new language is argued to be a difficult category learning problem in adulthood. For instance, native Japanese speakers find it difficult to learn to categorize English /r/ versus /l/ sounds (Iverson et al. 2003). This difficulty is likely due to the high variability and multidimensional nature of speech categories (Hillenbrand et al. 1995; Jongman et al. 2000; Vallabha et al. 2007; Holt and Lotto 2008, 2010). Adequate feedback can significantly enhance speech category learning in adults (McCandliss et al. 2002; McClelland and Patterson 2002; Norris et al. 2003; Goudbeek et al. 2008). Trial-by-trial feedback is therefore ubiquitously used in speech training paradigms. However, little is known about the neural mechanisms underlying feedback-based error reduction in speech learning (Holt and Lotto 2008, 2010). Understanding the neural mechanisms mediating feedback-based learning is critical because subtle variations in feedback characteristics can significantly modulate speech learning rates (Chandrasekaran et al. 2014b). Furthermore, it would contribute to our general knowledge of the neural mechanisms involved in learning a second language.

Outside the speech domain, previous research examining visual category learning has identified at least two partially dissociable neural systems that process feedback: a reflective system, wherein processing is under conscious control, and a reflexive system that is not under conscious control (Ashby and Alfonso-Reese 1998; Poldrack and Packard 2003; Ashby and Ennis 2006; Nomura et al. 2007; Seger and Miller 2010). The reflective system, also referred to as the rule-based learning system in the literature, uses working memory and executive attention to develop and test verbalizable rules based on feedback (Maddox and Ashby 2004). It relies on an executive corticostriatal loop that primarily involves the dorsolateral prefrontal cortex (DLPFC), head of the caudate nucleus, the anterior cingulate cortex, and the hippocampus. These brain regions contribute to the generation, selection, and maintenance of verbalizable rules. In contrast, the reflexive learning system, also referred to as the procedural-based learning system, is not consciously penetrable, is nonverbalizable, and operates by associating perception with actions that lead to immediate reward (Maddox and Chandrasekaran 2014; Chandrasekaran et al. 2014a; Maddox et al. 2014). During reflexive learning, a single medium-spiny neuron in the striatum implicitly associates an abstract motoric response with a group of sensory cells. Learning occurs within cortical–striatal synapses, wherein plasticity is facilitated by a reinforcement signal from the ventral striatum (Ashby and Ennis 2006; Seger 2008). A recent study examining visual category learning showed that the putamen is critical in reflexive learning (Waldschmidt and Ashby 2011). Animal research has shown that both the reflective and reflexive circuitries receive direct input from several auditory regions (Reale and Imig 1983; Yeterian and Pandya 1998). While the role of the reflective auditory loop has been extensively studied (Romanski et al. 1999; Rauschecker and Scott 2009), much less is known about the role of the reflexive learning system in speech processing.

In the current study, we examined the hypothesis that optimal speech category learning is mediated by the neural circuitry underlying the reflexive learning system. We hypothesized that reflective learning of speech categories is difficult due to the multidimensional nature and high variability of speech categories. In addition, dimensions underlying speech categories are integral and often difficult to verbalize (Lisker 1986; Hillenbrand et al. 1995; Jongman et al. 2000; Vallabha et al. 2007; Holt and Lotto 2008, 2010). By definition, it is difficult to selectively attend to a single dimension of integral-dimension stimuli (Shepard 1964; Garner 1974; Ashby 1992a). Indeed, when the mode of stimulus presentation and the nature of the trial-by-trial feedback were manipulated in a recent behavioral study examining speech learning (Chandrasekaran et al. 2014b), learning was enhanced under conditions that were previously shown to augment reflexive learning in the visual domain (Maddox et al. 2003, 2008). Computational modeling of behavioral data collected in a similar learning paradigm revealed that optimal speech category learning is associated with initial use of reflective strategies followed by a transition to the use of reflexive strategies (Maddox and Chandrasekaran 2014).

Despite this growing body of evidence suggesting that speech category learning is reflexive, there is currently no neural evidence of the relative role of the two learning systems in speech categorization. To this end, we employ a combination of behavioral, neural, and computational modeling methods to evaluate the mechanisms underlying feedback-dependent speech category learning. Specifically, we predict that optimal speech category learning will be associated with increased processing in the putamen, which is hypothesized to be involved in a “motor loop” that implicitly associates stimuli with category responses within the motor cortex. We use an individual differences approach as well as computational modeling to assess the mechanistic link between learning and computations within the domain-general learning systems. Adult native speakers of English (N = 23) learned novel speech categories (Mandarin tone categories; Fig. 1) while blood oxygen level-dependent (BOLD) responses were collected. Participants made a category response to each stimulus, which resulted in positive or negative feedback. Neural activation during stimulus presentation and during feedback processing was separately estimated using an optimized rapid event-related design. Behavioral accuracies were calculated and decision-bound models were applied at the level of individual participants to provide a window into cognitive processing and the computational strategies employed at different stages of category learning.

Figure 1.
(Left) Stimulus space. 2 native Mandarin talkers produced 4 Mandarin lexical tones in 5 syllable contexts. The x-axis represents normalized average fundamental frequency height (“pitch height”) of each stimulus. The y-axis represents normalized ...

Materials and Methods


Native speakers of American English (age: 18–35; n = 25; 14 females) were recruited from the University of Texas at Austin community. Participants self-reported as being right-handed and passed a hearing screening examination (pure tone thresholds < 25 dB HL at 1, 2, and 4 kHz). Further, participants had no prior exposure to a tonal language, as determined by an abbreviated form of the LEAP-Q (Marian et al. 2007). Potential participants were excluded if they reported a current or past history of major psychiatric conditions, neurological disorders, hearing disorders, head trauma, or use of psychoactive drugs or psychotropic medication. Data from 2 male participants were excluded from all analyses due to file corruption or an incidental finding on the structural scan. The University of Texas at Austin IRB approved the experimental protocol.


Natural exemplars (N = 40) of the 4 Mandarin tones (high-flat, low-rising, high-falling, low-dipping) were produced in citation form by 2 native Mandarin speakers (originally from Beijing; 1 female) in the context of 5 monosyllabic Mandarin Chinese words (/bu/, /di/, /lu/, /ma/, /mi/). These syllables were chosen because they also exist in the American English inventory. The stimuli were normalized for RMS amplitude of 70 dB and duration of 0.4 s (Wong et al. 2009; Perrachione et al. 2011). Five independent native speakers correctly identified the 4 tones (>95%) and rated the stimuli as highly natural.


Participants performed a category learning task in the scanner while listening to the speech sounds presented through headphones. Visual stimuli, including the instructions and feedback, were displayed via the in-scanner projector and viewed through a mirror attached to the head coil. Participants were equipped with a 2-button response box in each hand. Prior to scanning, participants underwent a brief training procedure in which they familiarized themselves with the association of keys to 4 possible responses. Tone learning procedures closely followed a previous study on visual category learning in the scanner (Nomura et al. 2007). The experiment consisted of 6 contiguous scans, or “learning blocks”. Prior to each block, participants were instructed to attend to the fixation cross on the screen. During each trial, an auditory stimulus was presented for 445 ms. Participants were instructed to categorize the sound into 1 of 4 categories. They were encouraged to guess even if they did not know the answer. Following a jittered stimulus–feedback interval, corrective feedback (“RIGHT” versus “WRONG”) was displayed for 750 ms (Fig. 1). If the participant failed to respond within the 2 s following stimulus onset, the response did not register and a cautionary feedback display was presented (“TIME”). Each stimulus was presented once within each block. The presentation order of the stimuli was pseudorandomized into a sequence common for all participants but different across learning blocks.

Scan Parameters

The participants were scanned using the Siemens Magnetom Skyra 3T MRI scanner at the Imaging Research Center of the University of Texas at Austin. High-resolution whole-brain T1-weighted anatomical images were obtained via MPRAGE sequence (repetition time [TR] = 2.53 s; echo time [TE] = 3.37 ms; field of view [FOV] = 25 cm; 256 × 256 matrix; 1 × 1 mm voxels; 176 axial slices; slice thickness = 1 mm; distance factor = 0%). T2*-weighted whole-brain blood oxygen level-dependent (BOLD) images were obtained using a gradient-echo multi-band EPI pulse sequence (flip angle = 60°; TR = 1.8 s; 166 repetitions; TE = 30 ms; FOV = 25 cm; 128 × 128 matrix; 2 × 2 mm voxels; 36 axial slices; slice thickness = 2 mm; distance factor = 50%) using GRAPPA with an acceleration factor of 2. To separately estimate neural responses to the stimulus from the response to the feedback, the stimulus–feedback and feedback–stimulus intervals were randomly jittered using samples from a uniform distribution (stimulus–feedback: 2–4 s; feedback–stimulus: 1–3 s; Fig. 1; Dale 1999; Liu et al. 2001; Birn et al. 2002).
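The jittering scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the seed, and the use of 40 trials per block (one per stimulus) are ours, and the study's actual sequence-optimization procedure is not described in the text.

```python
import random

def sample_jitter(n_trials, seed=0):
    """Draw (stimulus-feedback, feedback-stimulus) intervals in seconds.

    Both intervals are sampled from uniform distributions with the
    ranges reported in the text: 2-4 s and 1-3 s.
    """
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    intervals = []
    for _ in range(n_trials):
        stim_to_feedback = rng.uniform(2.0, 4.0)
        feedback_to_stim = rng.uniform(1.0, 3.0)
        intervals.append((stim_to_feedback, feedback_to_stim))
    return intervals

# One block: each of the 40 stimuli is presented once.
schedule = sample_jitter(40)
```

Varying both intervals independently is what decorrelates the stimulus and feedback regressors, so that their responses can be estimated separately.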

Behavioral Analysis


Each participant's response on each trial was coded as “correct” or “incorrect,” with the missed trials also being coded as incorrect. A mixed logit analysis was conducted to estimate the log odds of producing a correct response, using lmer (Bates et al. 2012). The fixed effect of interest was the block number (1–6), mean-centered to 0 (−2.5, −1.5, −0.5, 0.5, 1.5, 2.5). The model was corrected for by-participant random slopes for each block and the random intercept for each block.
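As a small illustration of the mean-centering step, the sketch below reproduces the centered block codes listed above. The variable names are ours. With this coding, the model intercept is the log odds of a correct response at the midpoint of training (between blocks 3 and 4).

```python
# Block numbers as presented to participants.
blocks = [1, 2, 3, 4, 5, 6]

# Subtract the mean block number (3.5) so the codes are centered on 0.
mean_block = sum(blocks) / len(blocks)
centered = [b - mean_block for b in blocks]

print(centered)  # [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
```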

Model Fitting Approach

The model fitting approach closely followed the methodology published in Maddox and Chandrasekaran (2014) and in other applications to speech and vision (Maddox 2002; Maddox et al. 2013, 2014; Maddox and Filoteo 2011; Chandrasekaran et al. 2014a). We fit each model on a block-by-block basis separately to the data from each participant to circumvent misleading interpretations from fits to aggregate data (Estes 1956; Ashby et al. 1994; Maddox 1999). We assumed that the 2-dimensional space (pitch height vs. pitch direction) displayed in Figure 1 accurately describes the perceptual representation of the stimuli. Previous multidimensional scaling studies suggest that these 2 dimensions explain a significant percentage of variance (Chandrasekaran et al. 2007). Based on the results from our earlier work (Maddox and Chandrasekaran 2014), we also assumed that participants applied category learning strategies separately to the male and female perceptual spaces (Fig. 1). We explored 3 classes of models: reflexive, reflective, and a random responder model. The model parameters were estimated using maximum likelihood procedures (Wickens 1982; Ashby 1992b). Model fits were compared using Akaike weights to determine the best fitting model for each participant in each block of trials (Wagenmakers and Farrell 2004). Modeling analyses were also conducted using the Bayesian Information Criterion (BIC); in every case the results mirrored those reported with AIC. We provide the results using BIC in the Supplementary Material.
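The model comparison step can be illustrated with the standard Akaike weight formula (Wagenmakers and Farrell 2004). The sketch below is not the authors' code. The log-likelihood values are made up for illustration, and the parameter counts are the per-space counts given in the text (in the study, the male and female spaces were fit separately).

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion for a maximum likelihood fit."""
    return 2 * n_params - 2 * log_likelihood

def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to 1.

    Each weight is proportional to exp(-0.5 * delta), where delta is
    the model's AIC minus the smallest AIC in the set.
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

# Hypothetical fits for one participant in one block:
# model name -> (maximized log-likelihood, number of free parameters)
fits = {
    "reflexive (SPC)": (-40.0, 6),
    "reflective (pitch height)": (-45.0, 4),
    "random responder": (-55.0, 3),
}
names = list(fits)
weights = akaike_weights([aic(ll, k) for ll, k in fits.values()])
best_model = names[weights.index(max(weights))]
```

The model with the largest weight is taken as the best fitting strategy for that participant and block.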

The reflexive learning system was modeled using the Striatal Pattern Classifier (SPC; Ashby and Alfonso-Reese 1998; Maddox et al. 2002; Seger and Cincotta 2005; Ashby and Ennis 2006; Nomura et al. 2007). The model reflects the many-to-one mapping from the primary and secondary auditory cortices along the superior temporal gyrus to the striatum (Yeterian and Pandya 1998), where a low-resolution map of the perceptual space is represented among different striatal units. Category learning involves associating each category label with a cluster of striatal medium-spiny neurons (Hikosaka et al. 1989; Wilson 1995; Arnauld et al. 1996; Yeterian and Pandya 1998; Ashby and Ennis 2006). We model this association by assuming that each category is represented by a striatal “unit” in the pitch height–pitch direction space. The SPC assumed 4 striatal units in the 2-dimensional pitch height–pitch direction space for the male speakers and a separate 4 striatal units in the pitch height–pitch direction space for the female speakers. The SPC contained 6 free parameters in each space: 5 that determine the location of the units, and one that represents the noise associated with the placement of the striatal units. Versions of the SPC have already been applied to an artificial auditory category learning task (Maddox et al. 2006), a vowel categorization task (Maddox et al. 2002), and Mandarin lexical tone learning (Maddox et al. 2013, 2014). It is important to note that the SPC is a computational model inspired by what is known about the neurobiology of the striatum. For this reason, the striatal “units” are hypothetical and could be interpreted within the language of other computational models (e.g., as “prototypes” in a multiple-prototype model like SUSTAIN; Love et al. 2004).
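To make the idea of striatal “units” concrete, the following is a deliberately simplified sketch, not the SPC as implemented in the cited work. It assumes that each stimulus is assigned to the category of the nearest unit, and it approximates noise as Gaussian perturbation of the percept, estimated by simulation. The unit coordinates, the noise value, and all function names are hypothetical.

```python
import math
import random

def nearest_unit(point, units):
    """Return the category label whose unit lies closest to the point."""
    return min(units, key=lambda label: math.dist(point, units[label]))

def response_probabilities(stimulus, units, noise_sd, n_samples=5000, seed=0):
    """Estimate P(response = each tone) for one stimulus by simulation.

    Assumption: noise is modeled as isotropic Gaussian jitter on the
    percept in the pitch height x pitch direction space.
    """
    rng = random.Random(seed)
    counts = {label: 0 for label in units}
    for _ in range(n_samples):
        percept = (rng.gauss(stimulus[0], noise_sd),
                   rng.gauss(stimulus[1], noise_sd))
        counts[nearest_unit(percept, units)] += 1
    return {label: c / n_samples for label, c in counts.items()}

# Hypothetical unit locations (pitch height, pitch direction), one per tone.
units = {1: (0.8, 0.5), 2: (0.4, 0.8), 3: (0.2, 0.4), 4: (0.6, 0.1)}
probs = response_probabilities((0.75, 0.5), units, noise_sd=0.1)
```

In a fitting context, the unit locations and the noise parameter would be adjusted to maximize the likelihood of a participant's observed responses.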

A series of unidimensional reflective models was also fit to the data. The unidimensional reflective models assumed that the participant set 3 criteria along the pitch height or pitch direction dimension, ignoring the other dimension. The unidimensional height model assumed that the 3 criteria along the pitch height dimension were used to separate the stimuli into low, medium-low, medium-high, or high pitch height, each of these being associated with one of the tone categories, while ignoring the pitch direction dimension. Although a large number of versions of this model are possible, we explored the 8 variants of the model that made the most reasonable assumptions regarding the assignment of category labels to the 4 response regions. Using the convention that the first, second, third, and fourth category labels are associated with low, medium-low, medium-high, and high pitch height, respectively, the 8 variants were: 3214, 3412, 3241, 3421, 2314, 4312, 2341, and 4321. The unidimensional direction model assumed that the 3 criteria along the pitch direction dimension were used to separate the stimuli into low, medium-low, medium-high, or high pitch direction, each of these being associated with one of the tone categories, while ignoring the pitch height dimension. Although a large number of versions of this model are possible, we explored the 2 variants of the model that made the most reasonable assumptions regarding the assignment of category labels to the 4 response regions. Using the convention that the first, second, third, and fourth category labels are associated with low, medium-low, medium-high, and high pitch direction, respectively, the 2 variants were: 4312 and 4132. The unidimensional models each contained 4 free parameters in each space: 3 criteria and one noise parameter. The random responder model assumed a fixed probability of responding tone 1, tone 2, tone 3, and tone 4, allowing for response biases. 
The model had 3 free parameters in each space to reflect the predicted probability of responding “1,” “2,” or “3”, the probability of responding “4” being equal to 1 minus the sum of the other 3.
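The unidimensional and random responder models described above can be sketched as follows. This is a minimal illustration, assuming Gaussian perceptual noise around the stimulus value. The criteria, the noise value, and the function names are hypothetical. A variant string such as "3214" assigns tone 3 to the low region, tone 2 to medium-low, tone 1 to medium-high, and tone 4 to high, following the convention in the text.

```python
from statistics import NormalDist

def unidimensional_probs(x, criteria, noise_sd, variant="3214"):
    """P(each tone) for a stimulus at position x on a single dimension.

    Three criteria split the dimension into 4 response regions; the
    other dimension is ignored. Noise is assumed Gaussian.
    """
    c_low, c_mid, c_high = sorted(criteria)
    percept = NormalDist(mu=x, sigma=noise_sd)
    cum = [percept.cdf(c) for c in (c_low, c_mid, c_high)]
    region_probs = [cum[0], cum[1] - cum[0], cum[2] - cum[1], 1.0 - cum[2]]
    tones = [int(ch) for ch in variant]  # tone label for each region
    return dict(zip(tones, region_probs))

def random_responder_probs(p1, p2, p3):
    """Fixed response probabilities; tone 4 takes the remainder."""
    return {1: p1, 2: p2, 3: p3, 4: 1.0 - (p1 + p2 + p3)}

# Hypothetical parameter values for illustration only.
height_probs = unidimensional_probs(0.9, (0.25, 0.5, 0.75), noise_sd=0.05)
guess_probs = random_responder_probs(0.4, 0.3, 0.2)
```

The unidimensional model therefore has 4 free parameters (3 criteria and 1 noise term) and the random responder model has 3, matching the counts given above.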

Finally, a more complex conjunctive reflective model was also considered in a secondary analysis. In previous behavioral pilot work, we elicited verbal descriptions of the 4 categories after the category learning task. No participant reported a conjunctive reflective strategy, although several described unidimensional strategies. However, since a conjunctive model is theoretically possible, we conducted a separate analysis using this model as a possibility. The model assumed that the 2 criteria along the pitch direction dimension are used to separate the stimuli into falling, flat, or rising pitch direction. Falling pitch direction items are classified as tone category 4 and rising pitch direction items as tone 2. If an item is classified as flat pitch direction, the pitch height dimension is examined. The single criterion along the pitch height dimension is used to separate the stimuli into low and high pitch height. Stimuli that have flat pitch direction and high pitch height are classified as tone 1 and flat pitch direction items of low pitch height as tone 3. This model contained 4 free parameters in each space: 3 criteria and one noise parameter. Inclusion of this model did not alter the main findings of the study, and therefore we only present the findings of this secondary analysis as Supplementary Material.
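The conjunctive decision rule can be written out directly. The sketch below is a noise-free illustration of the rule only. It assumes that smaller pitch direction values correspond to falling contours and larger values to rising contours, and the criterion values are hypothetical.

```python
def conjunctive_rule(height, direction, dir_low, dir_high, height_crit):
    """Classify a stimulus with the conjunctive rule (noise omitted).

    Assumption: direction < dir_low is falling, direction > dir_high
    is rising, and anything in between is flat.
    """
    if direction < dir_low:
        return 4  # falling pitch direction -> tone 4
    if direction > dir_high:
        return 2  # rising pitch direction -> tone 2
    # Flat direction: examine pitch height.
    return 1 if height > height_crit else 3

# Hypothetical criteria on normalized dimensions.
params = dict(dir_low=0.3, dir_high=0.7, height_crit=0.5)
print(conjunctive_rule(0.8, 0.5, **params))  # flat and high -> 1
print(conjunctive_rule(0.2, 0.5, **params))  # flat and low -> 3
```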

To assess the strategy selection by participants over the course of learning blocks, a linear mixed effects analysis was applied to the set of best fitting models per block for each participant (Bates et al. 2012). Mean-centered block numbers were included as the dependent variable, with the best fitting strategy being the fixed effects (the reflexive model as the reference), corrected for by-participant random intercepts. Finally, an analysis was run to examine whether reflexive strategy use was associated with better learning than nonreflexive strategies. The dependent variable was trial-by-trial accuracy. The fixed effects were the mean-centered block numbers and whether the participant was using a reflexive strategy. The model was corrected for a random intercept of each participant, as well as the random slope of block by strategy interaction for each participant.

fMRI Preprocessing

fMRI data were analyzed using FMRIB's Software Library Version 5.0 (Smith et al. 2004; Woolrich et al. 2009; Jenkinson et al. 2012). BOLD images were motion corrected using MCFLIRT (Jenkinson et al. 2002). All images were brain-extracted using BET (Smith 2002; Jenkinson et al. 2005). Registration to the high-resolution anatomical image (df = 6) and the MNI 152 template (df = 12; Grabner et al. 2006) was conducted using FLIRT (Jenkinson and Smith 2001; Jenkinson et al. 2002). Six separate block-wise first-level analyses were run within-subject. The following prestatistics processing was applied: spatial smoothing using a Gaussian kernel (FWHM = 5 mm); grand-mean intensity normalization of the entire 4D dataset by a single multiplicative factor; high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting; σ = 50.0 s). Each event was modeled as an impulse convolved with a canonical double-gamma hemodynamic response function (phase = 0 s). Motion estimates were modeled as nuisance covariates. The temporal derivative of each event regressor, including the motion estimates, was added. Time-series statistical analysis was carried out using FILM with local autocorrelation correction (Smith et al. 2004). The events of interest were stimulus, response, and feedback, which were further subdivided according to the accuracy valence: correct, incorrect, and missed. The missed trials were treated as nuisance variables.

Whole-Brain Analysis

First-level analysis results were submitted to a second-level analysis using fixed effects with 3 regressors: group average, mean-centered block numbering, and mean-centered accuracy per block per participant. The latter 2 regressors were included as nuisance variables to counteract systematic trends in the data across multiple blocks. Third-level group analysis was performed for each contrast using FLAME1 (Woolrich et al. 2009). Poststatistical analysis was performed using randomise in FSL to run permutation tests (n = 50 000) for the GLM and yield threshold-free cluster enhancement (TFCE) estimates of statistical significance (Freedman and Lane 1983; Kennedy 1995; Bullmore et al. 1999; Anderson and Robinson 2001; Nichols and Holmes 2002; Hayasaka and Nichols 2003). Finally, in order to assess the activation patterns associated with optimal learning, the first-level analysis from the final block was submitted to a second-level analysis with 2 regressors of group average and mean-centered accuracy for the final block (Table 1).
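The logic of a permutation test on a group mean can be illustrated with a one-sample sign-flipping scheme. This is a conceptual sketch only: it does not reproduce randomise, it omits TFCE and the nuisance regressors entirely, and the contrast values and function name are hypothetical.

```python
import random

def sign_flip_p_value(values, n_perm=5000, seed=0):
    """One-sided permutation p-value for a group mean greater than 0.

    Under the null hypothesis the sign of each participant's contrast
    value is exchangeable, so signs are flipped at random to build the
    null distribution of the mean.
    """
    rng = random.Random(seed)
    observed = sum(values) / len(values)
    exceed = 0
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if sum(flipped) / len(flipped) >= observed:
            exceed += 1
    # Add 1 to count the observed labeling as one permutation.
    return (exceed + 1) / (n_perm + 1)

# Hypothetical per-participant contrast values at a single voxel.
contrast = [0.8, 1.1, 0.4, 0.9, 1.3, 0.7, 0.5, 1.0, 0.6, 1.2, 0.9, 0.3]
p_value = sign_flip_p_value(contrast)
```

In a whole-brain analysis the same relabeling is applied to every voxel at once, and the maximum statistic across voxels is used to control the family-wise error rate.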

Table 1
Whole-brain analysis

Region of Interest Analysis

Four regions of interest (ROIs), chosen a priori, were defined: (1) left and right DLPFC and (2) left and right putamen. The DLPFC was anatomically defined using Brodmann areas 9/46 (Spence et al. 2000; Pochon et al. 2002; Curtis and D'Esposito 2003; Anderson et al. 2004) per the atlas included in the MRIcron package (Rorden 2007). The putamen was anatomically defined using the Harvard–Oxford Subcortical Atlas (Frazier et al. 2005; Desikan et al. 2006; Makris et al. 2006; Goldstein et al. 2007). The masks were linearly registered to the MNI152 space (Grabner et al. 2006) using FLIRT (Jenkinson and Smith 2001; Jenkinson et al. 2002; Fig. 5). Percent signal changes in the (correct − incorrect) contrast for feedback processing were calculated by first linearly registering the ROIs to the individual BOLD spaces using FLIRT with the appropriate transformation matrices generated from the first-level analysis and nearest neighbor interpolation (Jenkinson and Smith 2001; Jenkinson et al. 2002). Then, the contrast parameter estimate images were masked for the transformed ROIs, multiplied by the height of the double-gamma function for the stimulus length of 1 s (0.0288), converted into percent scale, divided by mean functional activation, and averaged within the ROI using fslmaths (Mumford 2007).
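The arithmetic of the percent signal change calculation can be sketched as follows. This is an illustration of the scaling steps listed above applied to plain lists of voxel values, not the fslmaths pipeline itself. The voxel values are made up, the function name is ours, and the default scaling height is the value quoted in the text.

```python
def percent_signal_change(cope_values, mean_func_values, peak_height=0.0288):
    """Average percent signal change across the voxels of one ROI.

    Each contrast estimate is scaled by the regressor height, converted
    to percent, and divided by the voxel's mean functional intensity.
    """
    per_voxel = [
        100.0 * cope * peak_height / mean_func
        for cope, mean_func in zip(cope_values, mean_func_values)
    ]
    return sum(per_voxel) / len(per_voxel)

# Hypothetical contrast estimates and mean intensities for 3 ROI voxels.
psc = percent_signal_change([120.0, 95.0, 150.0], [9000.0, 8800.0, 9100.0])
```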

Figure 5.
Correct versus incorrect activation correlated with individual accuracy scores in the final block, during stimulus presentation phase. Activities in the bilateral Heschl's gyri, right inferior parietal lobule, right inferior frontal gyrus, bilateral insula, ...


Behavioral Results


The average performance for the initial block was 23% (standard deviation [SD] = 9%; 95% confidence interval [CI] [19%, 26%]), close to the chance level of 25%. By the final block, average performance was 54% (SD = 27%; 95% CI [42%, 66%]). Performance in the initial and final blocks was positively correlated, r(21) = 0.432, P = 0.040, 95% CI [0.024, 0.716]. A mixed effects analysis was conducted to assess the learning progress. The dependent variable was trial-by-trial accuracy (correct vs. incorrect), and the fixed effect was the mean-centered block number. The intercept was not significant, b = −0.16, standard error (SE) = 0.23, z = −0.72, P = 0.47, 95% CI [−0.63, 0.30]. The effect of the mean-centered block number was significant, b = 0.32, SE = 0.070, z = 4.61, P < 0.0001, 95% CI [0.18, 0.47], indicating an overall learning effect across blocks (Fig. 2).

Figure 2.
(Left) Individual behavioral learning performance across all blocks, for all participants. Each cell represents the learning profile for individual participants. The x-axis shows the learning blocks. The y-axis shows the average response accuracy per ...

Model-Based Analyses

Participants were observed to use various strategies, including all models considered in the process. As described above, for each block, a strategy (reflexive, reflective pitch height, reflective pitch direction, or random responder) was assigned according to the best fitting model. Several notable patterns could be verified from the mixed effects analysis estimating the average block number for each of the assigned strategies. Since the block numbers were mean-centered, positive estimates for each level of strategy would indicate that the given strategy was more likely to be utilized late in learning (block 4, 5, or 6), while negative estimates would indicate that the given strategy was more likely to be utilized early in learning (block 1, 2, or 3). The mean block for the reflexive strategy (intercept) was significant, b = 0.79, SE = 0.25, t = 3.10, P = 0.0024, 95% CI [0.29, 1.28], indicating that a given reflexive strategy was more likely to be utilized late in learning. The random responder model was significant, b = −1.29, SE = 0.36, t = −3.60, P = 0.00044, 95% CI [−1.98, −0.59], indicating that the random responder strategy was more likely to be utilized earlier in learning than the reflexive strategy. Similar patterns were observed to be statistically significant for unidimensional reflective strategies, which were utilized earlier in learning than the reflexive strategy: pitch direction, b = −1.29, SE = 0.60, t = −2.13, P = 0.035, 95% CI [−2.46, −0.11]; pitch height, b = −0.94, SE = 0.35, t = −2.66, P = 0.009, 95% CI [−1.63, −0.25]. Taken together, these results indicate that the slow-learning reflexive strategy was more likely to be utilized late in learning, whereas the fast-learning reflective or random responder strategies were more likely to be utilized early in learning (Fig. 2). 
In an analysis designed to test whether the reflexive strategies yielded better learning outcomes, a logistic regression was conducted with the trial-by-trial accuracy as the dependent variable and the mean-centered block number, block-by-block strategy, and their interaction term as fixed effects. There were 2 levels in the block-by-block strategy term: reflexive versus nonreflexive (reference level). There was a nonsignificant interaction between block number and strategy, b = −0.96, SE = 0.84, z = −1.15, P = 0.25. Therefore, we focused on a model that only included the main effects. For the average block number (between 3 and 4), the log odds of producing an accurate response compared with an inaccurate response for the nonreflexive strategy were negative, b = −0.37, SE = 0.17, z = −2.18, P = 0.030, indicating that the probability of an accurate response was significantly below 50%. The block effect was significant, b = 0.26, SE = 0.057, z = 4.58, P < 0.0001, indicating that the odds of producing an accurate response compared with an inaccurate response were higher for later blocks than for earlier blocks. The strategy effect was significant, b = 0.39, SE = 0.19, z = 2.07, P = 0.038, indicating that reflexive strategy use, compared with nonreflexive strategy use, was associated with increased odds of producing an accurate response compared with an inaccurate response. These results suggest that learning improved over time, and that reflexive strategy use was associated with better learning than nonreflexive strategies.
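To make the log odds estimates easier to interpret, they can be converted to probabilities with the inverse logit function. The sketch below uses only the fixed-effect coefficients reported above and ignores the random effects, so the resulting values are rough illustrations rather than model predictions for any participant.

```python
import math

def inv_logit(log_odds):
    """Convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

intercept = -0.37  # nonreflexive strategy at the average block
strategy = 0.39    # added log odds for reflexive strategy use

p_nonreflexive = inv_logit(intercept)         # about 0.41
p_reflexive = inv_logit(intercept + strategy)  # about 0.50
```

At the midpoint of training, reflexive strategy use thus corresponds to roughly a 10 percentage point higher probability of a correct response, against a chance level of 25%.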

Whole-Brain Analysis

Feedback Processing

Averaging across correct and incorrect responses (correct + incorrect) did not yield any significant activations associated with feedback processing. Testing whether the activation for correct trials was higher than for incorrect trials (correct − incorrect; Fig. 3) yielded areas associated with the corticostriatal loops involved in category learning (Seger 2008, 2010). The ventral striatum including the nucleus accumbens was activated, as well as the anterior cingulate cortex. These 2 areas form a part of the motivational loop that processes reward value in feedback, which is greater in positive than negative feedback. The left dorsolateral prefrontal cortex and the left head of caudate were activated, which are parts of the executive loop that form the basic circuitry underlying reflective learning. The bilateral putamens were activated, which are involved in the categorization process via the connection to the motor regions. The left inferior parietal lobule was activated, which functions as the sensorimotor interface that maps sensory speech information onto articulatory gestures (Hickok and Poeppel 2007). Finally, the left middle temporal gyrus/superior temporal sulcus region was activated. During feedback processing, there was no meaningful auditory stimulus to be processed, and the level of auditory sensory input was identical across positive and negative feedback. Therefore, the activation in the superior temporal area as well as the inferior parietal lobule was presumably not driven by the auditory stimulus alone but reflects feedback-driven strengthening of stimulus-to-response/category association (Weil et al. 2010). No brain region showed significantly higher activation for incorrect trials than for correct trials (incorrect − correct).

Figure 3.
Activation during feedback processing phase, correct versus incorrect accuracy valence. Activities in left dorsolateral prefrontal cortex, anterior cingulate cortex, left caudate nucleus, bilateral putamens, ventral striatum, left middle temporal gyrus/superior ...

Stimulus Presentation

Averaging across the accuracy valence (correct + incorrect), stimulus presentation was found to elicit activation in the bilateral Heschl's gyri, planum temporales, and the posterior superior temporal gyri, consistent with the auditory nature of the task. Activation for correct trials was higher than for incorrect trials in the right planum temporale and the insular cortex, and the left pre- and postcentral cortices (correct − incorrect; Fig. 4). Also, the right inferior parietal lobule was shown to be sensitive to accurate categorization, consistent with its proposed role as the sensorimotor interface between auditory processing and articulatory mapping (Hickok and Poeppel 2007). No brain regions showed higher activation for incorrect trials than for correct trials (incorrect − correct).

Figure 4.
Activation during stimulus presentation phase, correct versus incorrect accuracy valence. Activities in the left precentral gyrus, postcentral gyrus, superior parietal lobule, right insula, planum temporale and inferior parietal lobule are observed. The ...

To assess the activation patterns associated with optimal learning, the final block contrast for correct trials relative to incorrect trials (correct − incorrect) during stimulus perception was regressed against the accuracy scores from the final block. In this analysis, individual accuracy scores were found to positively correlate with increased activation in the speech processing areas of the bilateral Heschl's gyrus, right inferior parietal lobule, right inferior frontal gyrus, and the bilateral insula. Higher accuracy was also associated with increased activation in the bilateral putamen, right caudate nucleus, the motor cortex, and the anterior cingulate cortex, suggesting that better performance in the final block was related to the involvement of the corticostriatal learning systems, and in particular, the motor loop encompassing the motor cortex and the putamen (Fig. 5). No brain regions showed negative correlation with the accuracy scores.
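The across-participant regression described above can be illustrated for a single region. This is a minimal sketch under stated assumptions: the accuracy and contrast values are invented for illustration, the helper names are hypothetical, and the study's analysis was carried out on whole-brain maps rather than on one summary value per participant.

```python
import math

def ols_fit(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: one value per participant.
accuracy = [0.40, 0.50, 0.60, 0.70, 0.80]   # final-block proportion correct
contrast = [0.05, 0.20, 0.15, 0.35, 0.40]   # correct - incorrect signal change

slope, intercept = ols_fit(accuracy, contrast)
r = pearson_r(accuracy, contrast)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

A positive slope corresponds to the pattern reported here: participants with higher final-block accuracy show a larger correct − incorrect difference in the region.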

Category Response

Averaging across correct and incorrect responses (correct + incorrect), the activation associated with category response involved several areas within the extensive cortical networks. The bilateral pre- and postcentral areas were activated, reflecting finger movements necessary for making category responses. The decision making network involving the left dorsolateral prefrontal cortex and the anterior cingulate cortex was activated, reflecting the categorization process during response selection. The activation for correct and incorrect trials did not significantly differ (correct − incorrect; correct + incorrect).

ROI Analysis

Reflexive Strategy Use and Increased Putamen Activation

This analysis tested the hypothesis that the putamen is involved when category learning is mediated by the reflexive processing system. Participants were classified as reflexive versus nonreflexive (reflective or random) strategy users based on the best fitting model in each block. Mixed effects analyses were performed on the putamen and DLPFC on the left and right hemispheres. The dependent measure was the percent signal change (correct − incorrect) value during feedback processing in each block. The fixed effects were the mean-centered block number and strategy group (reference level: nonreflexive), with random intercepts for participants. In the left putamen, the block by strategy interaction was not significant, b = −0.15, SE = 0.082, t = −1.79, P = 0.076, 95% CI [−0.31, 0.013]. Therefore, we investigated the model without the interaction. The strategy effect was significant, b = 0.18, SE = 0.076, t = 2.31, P = 0.023, 95% CI [0.027, 0.324], suggesting that reflexive strategy use was associated with increased activation in the putamen for positive feedback processing relative to negative feedback processing, although Bonferroni correction for the number of ROIs (n = 4) renders this effect only marginally significant (corrected P = 0.091). The block effect was not significant, b = −0.058, SE = 0.035, t = −1.65, P = 0.10, 95% CI [−0.13, 0.011]. The intercept was not significant, b = 0.054, SE = 0.041, t = 1.33, P = 0.19, 95% CI [−0.025, 0.13]. No effects were significant in other ROIs (Fig. 6).
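The Bonferroni adjustment mentioned above multiplies an uncorrected P value by the number of tests and caps the result at 1. The sketch below applies it to the rounded P value printed in the text; the function name `bonferroni` is hypothetical. Starting from the rounded value gives roughly 0.092, so the reported 0.091 was presumably computed from the unrounded P value.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p_value * n_tests)

reported_p = 0.023   # strategy effect in the left putamen, as rounded in the text
n_rois = 4           # left and right putamen, left and right DLPFC
corrected = bonferroni(reported_p, n_rois)

print(round(corrected, 3))
print("significant at 0.05 after correction:", corrected < 0.05)
```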

Figure 6.
(Left) Dorsolateral prefrontal cortex (DLPFC) defined as Brodmann areas 9/46, and the putamen anatomically defined using the Harvard–Oxford Subcortical Atlas. (Right) Reflexive strategy use associated with increased activation in the left putamen ...


We examined the neural mechanisms underlying nonnative speech category learning in adults. Based on an extensive review of previous behavioral work (Ashby and Maddox 2005; Chandrasekaran et al. 2014b; Maddox et al. 2013, 2014; Maddox and Chandrasekaran 2014), we predicted that speech categories would be optimally learned via corticostriatal circuitry involved in reflexive learning (Seger 2008; Seger and Miller 2010). In this study, computational modeling of behavioral response strategies revealed an increase in the use of reflexive strategy, and a decrease in the use of reflective or random strategy with experience. Reflexive strategy use was associated with increased activation in the putamen during feedback processing. Final block categorization accuracy was associated with increased stimulus-related activation in the auditory areas that have been previously implicated in speech category learning. These include Heschl's gyrus (Wong et al. 2008), inferior parietal lobule (Gandour, Dzemidzic et al. 2003, Gandour, Wong et al. 2003), and the insular cortex (Wong et al. 2004). Furthermore, individual learning success was associated with activation in the putamen and the motor cortex. These areas have not been directly implicated in speech processing, but are considered to be key components of the corticostriatal motor loop that forms the reflexive category learning system (Seger 2008; Seger and Miller 2010). These behavioral, computational modeling and neuroimaging results help specify the mechanisms underlying feedback-dependent error reduction during speech learning. While speech learning has been mostly viewed as a perceptually encapsulated process in previous research, our findings represent an important conceptual advance in understanding the neurobiological basis of domain-general learning systems during speech processing.

Neural Circuitry Involved in Processing Positive Feedback

Positive feedback, relative to negative feedback, activated several functional loops within the corticostriatal system. These included the ventral striatum, a part of the motivational loop, which is critical in processing the reward value during corrective feedback (Seger 2008; Seger and Miller 2010). These results are consistent with previous work showing that the ventral striatum was more active during positive than negative feedback (Seger et al. 2010). In addition, the DLPFC, anterior cingulate, and the putamen were more active during positive feedback. The DLPFC and the anterior cingulate are key components of the reflective executive loop, which is involved in the explicit processing of trial feedback (Seger 2008; Seger and Miller 2010). These areas have been found to be more active on correct categorization trials during visual learning (Seger et al. 2010). The DLPFC is hypothesized to generate and store verbalizable rules, which are either retained or discarded by the anterior cingulate cortex depending on the valence of the feedback (Ashby and Alfonso-Reese 1998; Ashby and Ell 2001; Maddox et al. 2003; Ashby and Maddox 2005).

In addition to the DLPFC and the anterior cingulate cortex, which are parts of the reflective learning system, positive feedback also increased the activation in the putamen. The putamen, considered a part of the reflexive corticostriatal motor loop (Seger 2008; Seger and Miller 2010), is involved in the selection of appropriate motor responses based on prior experience. The putamen is therefore posited to be involved in procedural learning. Studies have shown that changing the button-to-category associations interferes with reflexive learning but not with reflective learning (Ashby et al. 2003; Maddox et al. 2004, 2010; Spiering and Ashby 2008). Indeed, a recent neuroimaging study suggested that the putamen is integral to reflexive learning of visual categories (Waldschmidt and Ashby 2011). The involvement of the motivational, executive (reflective), and motor (reflexive) loops is consistent with the predictions from the visual category learning literature. Overall, these results demonstrate a functional role for domain-general corticostriatal category learning systems in speech learning. During feedback processing, the ventral striatum responds to the reward value in positive feedback, the DLPFC and the anterior cingulate cortex generate and select rules based on the content of feedback, and the putamen is activated to transform stimulus representations into procedural responses.

Outside the corticostriatal category learning areas, particularly noteworthy is the activation of the speech-related auditory areas including the left superior temporal sulcus/middle temporal gyrus. Feedback was presented in the visual modality, and the level of sensory auditory stimulation did not vary across positive and negative feedback. Similar positive feedback-driven activation in sensory regions has been previously reported in the visual domain when the feedback was presented in the auditory domain (Weil et al. 2010). The activation of the visual cortex during positive feedback has been interpreted as evidence for the modulation of early sensory regions by the reward processing network. Our results can be interpreted within this framework. The left STS/MTG regions have been shown to be important for auditory speech processing (Hickok and Poeppel 2007; Rauschecker and Scott 2009). Activation of these regions during positive feedback may reflect a strengthening of the sensory representation of the rewarded stimulus, driven by the reward processing network. Future work would need to include more trials and effective connectivity analyses to test the possibility of a causal relationship (i.e., the influence of the reward processing network on sensory regions).

Positive feedback also activated the inferior parietal lobule (IPL), which is presumed to be an integral part of the phonological network (Hickok and Poeppel 2007). The IPL has been previously conceptualized as a temporary buffer in phonological working memory (Koelsch et al. 2009), especially regarding comparison and decision making (Strand et al. 2008). During feedback presentation, the auditory input is available only as a sensory memory trace (Sams et al. 1993; Haenschel et al. 2005). Thus, the IPL activation during positive feedback may reflect the mapping of the stored representation of the auditory stimulus onto the phonological categories (Buchsbaum and D'Esposito 2008; McGettigan et al. 2011). Since negative feedback does not directly provide stimulus-to-category information, we hypothesize that the IPL is less active during the negative feedback condition. Thus, positive feedback engages the reward processing network and may provide the critical learning signal for stimulus-to-category mapping within the IPL.

To conclude, positive feedback activates a large corticostriatal network. The reward value of positive feedback is thought to be processed in the ventral striatum. Category learning likely occurs within the reflective (dorsolateral prefrontal cortex and the anterior cingulate cortex) and the reflexive (putamen and the motor cortex) networks (Seger 2008; Seger and Miller 2010). Finally, positive feedback may strengthen the sensory representation of the rewarded stimulus and may promote stimulus-to-category mapping within the phonological network.

Reflexive Strategy Use Associated with Increased Putamen Activation

Individual learners adopt different speech category learning strategies depending on the stage of learning and individual capacity (Maddox and Chandrasekaran 2014). Computational modeling enables direct assessment of this variability in individual response strategies. In this study, response strategies were modeled in each learning block separately for each participant. Multidimensional scaling studies have shown that Mandarin tone categories are most parsimoniously distinguished using 2 pitch dimensions (height and direction; Chandrasekaran et al. 2010). In the current study, the 40 stimuli were embedded in a 2-dimensional space defined by average pitch height and average pitch direction. We hypothesized that the optimal strategy is reflexive and requires a predecisional integration of information across dimensions. We also explored nonoptimal strategies, which were either reflective, relying on only one of the 2 dimensions, or random. The modeling results revealed that participants typically used a reflective or random responder strategy early in learning and a reflexive strategy late in learning. In addition, the results suggested that reflexive strategy users, as determined on a block-by-block basis, were more accurate in the task. Therefore, learners initially used reflective strategies, but switched to the more optimal reflexive strategies as they gained expertise. This latter interpretation was supported by the mixed effects modeling result, which showed that reflexive strategy use was associated with better learning outcomes.
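The logic of comparing strategy models on one block of responses can be sketched as follows. This is a deliberately simplified stand-in and not the models fitted in the study: the tone prototypes, the nearest-prototype classifier used for the "reflexive" model, the single lapse-rate parameter, and the use of AIC for model comparison are all assumptions made for illustration.

```python
import math

# Hypothetical tone prototypes in a normalized (pitch height, pitch direction) space.
PROTOTYPES = {1: (0.8, 0.0), 2: (-0.2, 0.9), 3: (-0.9, 0.1), 4: (0.3, -0.9)}

def predict(stimulus, dims):
    """Category whose prototype is nearest to the stimulus on the given dimensions."""
    def dist(cat):
        return sum((stimulus[d] - PROTOTYPES[cat][d]) ** 2 for d in dims)
    return min(PROTOTYPES, key=dist)

def aic_for_rule(stimuli, responses, dims):
    """AIC of a rule that is followed except on a lapse, with lapses spread over the other 3 categories."""
    n = len(stimuli)
    misses = sum(predict(s, dims) != r for s, r in zip(stimuli, responses))
    eps = min(max(misses / n, 0.01), 0.99)  # maximum-likelihood lapse rate, clamped
    log_lik = (n - misses) * math.log(1 - eps) + misses * math.log(eps / 3)
    return 2 * 1 - 2 * log_lik              # one free parameter (eps)

def best_strategy(stimuli, responses):
    """Return the lowest-AIC strategy label and all AIC scores."""
    n = len(stimuli)
    scores = {
        "reflexive": aic_for_rule(stimuli, responses, dims=(0, 1)),      # integrates both dimensions
        "reflective_height": aic_for_rule(stimuli, responses, dims=(0,)),
        "reflective_direction": aic_for_rule(stimuli, responses, dims=(1,)),
        "random": -2 * n * math.log(0.25),                               # guessing, no free parameters
    }
    return min(scores, key=scores.get), scores

# Hypothetical stimuli on a grid, with responses from two idealized learners.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
stimuli = [(h, d) for h in grid for d in grid]
integrating_learner = [predict(s, (0, 1)) for s in stimuli]
height_only_learner = [predict(s, (0,)) for s in stimuli]

print(best_strategy(stimuli, integrating_learner)[0])
print(best_strategy(stimuli, height_only_learner)[0])
```

Running the comparison separately for each block and participant yields the kind of block-by-block strategy label used in the ROI analysis.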

Reflexive category learning depends on mapping the perceptual experience of the stimulus onto the motor response associated with the appropriate category. The cortical sensory input is relayed to the striatum via many-to-one convergent connections, which give rise to a low-resolution stimulus representation (Wilson 1995; Ashby and Ennis 2006). These striatal units allow association of stimuli to category responses, and these corticostriatal connections form the basis of reflexive category learning. The putamen is a strong candidate for this type of plasticity since it exhibits greater connectivity to the auditory association cortices relative to the caudate nucleus (Di Martino et al. 2008). The putamen is also involved in perceptual processing of auditory stimuli (Geiser et al. 2012), and has been implicated in visual category learning research as critical to reflexive learning (Waldschmidt and Ashby 2011). Our results showed that reflexive strategy use was associated with increased activation in the left putamen during feedback processing. However, no pattern pertaining to reflexive strategy use could be found in the DLPFC, suggesting that optimal strategy use is not the result of increased prefrontal reflective processing. The current study, therefore, supports the prediction that speech category learning is reflexive-optimal, and that the reflexive strategy involves the putamen during feedback processing.

The Role of Corticostriatal Loops in Successful Speech Categorization

During stimulus perception, individual variability in learning performance was associated with the involvement of the corticostriatal motor loop. In the visual category learning literature, the relative dominance of the reflective and reflexive learning systems is dependent on the stage of learning. Early learning is dominated by the executive, reflective learning system (Smith et al. 2012a; 2012b), but later stages of learning are associated with increased automaticity and putamen activation (Haruno and Kawato 2006; Williams and Eskandar 2006; Seger 2009). During reflexive learning, a single striatal “unit,” presumed to be located within the putamen, implicitly associates an abstract cortical–motor response with a large group of sensory cells within the sensory association cortex (Matelli and Luppino 1996; Seger 2008; Seger et al. 2010; Waldschmidt and Ashby 2011). Synaptic plasticity in the striatal cell is facilitated by a dopamine-mediated reinforcement reward signal from positive feedback, which is processed via the motivational loop that connects the ventral striatum and the anterior cingulate cortex (Seger 2008; Seger et al. 2010). In later stages of learning, the dopamine-mediated reinforcement signal becomes more consistent, allowing a stronger association between the stimulus and an accurate category label. As discussed earlier, optimal speech category learning is thought to be reflexive, given its multidimensionality and high variability. Therefore, optimal speech category learning necessitates a switch to the reflexive strategy (Chandrasekaran et al. 2014b), which is likely based on the activity of the loop between the putamen and the motor cortex. This prediction was reflected in the finding that the individual variability in learning performance was associated with increased involvement of the putamen and the motor cortex for correct trials relative to incorrect trials, during stimulus perception.
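The reward-gated association described above can be illustrated with a toy learning rule. This is a sketch under strong simplifying assumptions, not a model from the study: the binary sensory patterns, the learning rate, and the rule that only positively reinforced trials change the weights are all hypothetical choices.

```python
def train(trials, n_inputs, n_units, learning_rate=0.2):
    """Strengthen sensory-to-unit weights only on positively reinforced trials."""
    weights = [[0.0] * n_inputs for _ in range(n_units)]
    for pattern, response, correct in trials:
        reward = 1.0 if correct else 0.0  # stand-in for the dopamine-mediated signal
        for i, x in enumerate(pattern):
            # Three factors: sensory input, the unit that drove the response, and reward.
            weights[response][i] += learning_rate * reward * x
    return weights

def respond(weights, pattern):
    """Pick the response whose unit is most strongly driven by the sensory pattern."""
    drive = [sum(w * x for w, x in zip(row, pattern)) for row in weights]
    return max(range(len(weights)), key=drive.__getitem__)

# Hypothetical sensory patterns for two sound categories.
SOUND_A = [1, 1, 0, 0]
SOUND_B = [0, 0, 1, 1]

# Each trial is (sensory pattern, chosen response, whether feedback was positive).
trials = [
    (SOUND_A, 0, True), (SOUND_A, 1, False),
    (SOUND_B, 1, True), (SOUND_B, 0, False),
]
weights = train(trials, n_inputs=4, n_units=2)

print(respond(weights, SOUND_A), respond(weights, SOUND_B))
print(respond(weights, [1, 0, 0, 0]))  # a degraded version of sound A
```

Because each unit pools over many sensory inputs, a degraded or variable token of the same category can still drive the learned response, which is one way to picture why many-to-one convergence suits highly variable speech categories.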

The putamen receives convergent input (10 000 to 1) from the cortex, as do other parts of the striatum (Wilson 1995). These cortical afferents originate from the prefrontal cortex in the rostral putamen (Selemon and Goldman-Rakic 1985) and the motor and somatosensory areas (Alexander and DeLong 1985) and the superior temporal auditory areas (Yeterian and Pandya 1998) in the caudal putamen. The putamen has been purported to be involved in episodic memory, cognitive control, and category learning, in addition to motor processing. In fact, the putamen has been suggested to be the ideal site of acquisition of stimulus-to-response associations, where sensory stimuli are mapped onto context-specific motor activity that leads to favorable outcomes (Ell et al. 2011). Despite the many functions of the putamen, in the context of the current study, the putamen activation patterns are best interpreted as reflecting reflexive strategy use, as reflexive learning of the speech categories necessitates implicit associations between the speech sounds and behavioral category responses. The findings from this experiment indicate that increased putamen activation was associated with reflexive strategy use, as well as learning success in the final block. However, caution is warranted in attributing putamen activation to a specific task function. Further studies are required to confirm the role of the putamen in reflexive learning of speech categories. The various functions of the putamen relate to different anatomical regions of this structure (Ell et al. 2011). High-resolution mapping of the putamen may help clarify the specific role of the putamen in speech learning.


Category learning plays a vital role in human cognition. Speech category learning in adulthood is difficult, but feedback-dependent training can lead to successful speech categorization. In this study we examined the computational and neural mechanisms underlying feedback-dependent speech categorization, using a dual-systems approach developed in the visual domain. Considering the complexity of speech categories, it was hypothesized that optimal speech category learning would be associated with the reflexive system. Computational modeling results revealed that the learners were initially biased towards the reflective system, but gradually discarded it in favor of the reflexive system. Throughout learning, reflexive strategy use was associated with better learning performance. Positive feedback was associated with increased activation in reflective and reflexive circuitries. In addition, positive feedback activated the ventral striatum, a key component of the motivational loop, as well as several regions associated with auditory and speech processing. Furthermore, reflexive strategy use was associated with increased activation in the putamen, which is part of the motor loop that implicitly maps stimuli onto category responses in the motor cortex. Finally, increased activation of this motor loop during stimulus perception was associated with more accurate categorization. The neurocomputational and individual differences approaches reveal that successful speech category learning is critically dependent on domain-general corticostriatal learning systems.


Research reported in this publication was supported by the National Institute On Deafness And Other Communication Disorders of the National Institutes of Health under Award Number R01DC013315 834 (BC), and by the National Institute on Drug Abuse under Award Number DA032457 (WTM).


The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Conflict of Interest: None declared.


  • Alexander GE, DeLong MR. 1985. Microstimulation of the primate neostriatum. II. Somatotopic organization of striatal microexcitable zones and their relation to neuronal response properties. J Neurophysiol. 53:1417–1430. [PubMed]
  • Anderson MC, Ochsner KN, Kuhl B, Cooper J, Robertson E, Gabrieli SW, Glover GH, Gabrieli JD. 2004. Neural systems underlying the suppression of unwanted memories. Science. 303:232–235. [PubMed]
  • Anderson MJ, Robinson J. 2001. Permutation tests for linear models. Aust NZ J Stat. 43:75–88.
  • Arnauld E, Jeantet Y, Arsaut J, Demotes-Mainard J. 1996. Involvement of the caudal striatum in auditory processing: c-fos response to cortical application of picrotoxin and to auditory stimulation. Mol Brain Res. 41:27–35. [PubMed]
  • Ashby FG. 1992a. Multidimensional models of categorization. In: Ashby FG, editor. Scientific psychology series. Hillsdale, NJ: Lawrence Erlbaum Associates; p. 449–483.
  • Ashby FG. 1992b. Multivariate probability distributions. In: Ashby FG, editor. Multidimensional models of perception and cognition. Hillsdale, NJ: Lawrence Erlbaum Associates; p. 1–34.
  • Ashby FG, Alfonso-Reese LA. 1998. A neuropsychological theory of multiple systems in category learning. Psychol Rev. 105:442–481. [PubMed]
  • Ashby FG, Ell SW. 2001. The neurobiology of human category learning. Trends Cogn Sci. 5:204–210. [PubMed]
  • Ashby FG, Ell SW, Waldron EM. 2003. Procedural learning in perceptual categorization. Mem Cognition. 31:1114–1125. [PubMed]
  • Ashby FG, Ennis JM. 2006. The role of the basal ganglia in category learning. Psychol Learn Motiv. 46:1–36.
  • Ashby FG, Maddox WT. 2005. Human category learning. Annu Rev Psychol. 56:149–178. [PubMed]
  • Ashby FG, Maddox WT, Lee WW. 1994. On the dangers of averaging across subjects when using multidimensional scaling or the similarity-choice model. Psychol Sci. 5:144–151.
  • Bates D, Maechler M, Bolker B. 2012. lme4: Linear mixed-effects models using S4 classes. [Computer software].
  • Birn RM, Cox RW, Bandettini PA. 2002. Detection versus estimation in event-related fMRI: choosing the optimal stimulus timing. NeuroImage. 15:252–264. [PubMed]
  • Buchsbaum BR, D'Esposito M. 2008. The search for the phonological store: from loop to convolution. J Cogn Neurosci. 20:762–778. [PubMed]
  • Bullmore ET, Suckling J, Overmeyer S, Rabe-Hesketh S, Taylor E, Brammer MJ. 1999. Global, voxel, and cluster tests, by theory and permutation, for a difference between two groups of structural MR images of the brain. IEEE Trans Med Imaging. 18:32–42. [PubMed]
  • Chandrasekaran B, Gandour JT, Krishnan A. 2007. Neuroplasticity in the processing of pitch dimensions: a multidimensional scaling analysis of the mismatch negativity. Restor Neurol Neuros. 25:195–210. [PMC free article] [PubMed]
  • Chandrasekaran B, Koslov SR, Maddox T. 2014a. Toward a dual-learning systems model of speech category learning. Front Psychol. 5:825. [PMC free article] [PubMed]
  • Chandrasekaran B, Yi H, Maddox WT. 2014b. Dual-learning systems during speech category learning. Psychon B Rev. 21:488–495. [PMC free article] [PubMed]
  • Chandrasekaran B, Sampath PD, Wong PC. 2010. Individual variability in cue-weighting and lexical tone learning. J Acoust Soc Am. 128:456–465. [PubMed]
  • Curtis CE, D'Esposito M. 2003. Persistent activity in the prefrontal cortex during working memory. Trends Cogn Sci. 7:415–423. [PubMed]
  • Dale AM. 1999. Optimal experimental design for event-related fMRI. Hum Brain Mapp. 8:109–114. [PubMed]
  • Desikan RS, Ségonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, Buckner RL, Dale AM, Maguire RP, Hyman BT. 2006. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. NeuroImage. 31:968–980. [PubMed]
  • Di Martino A, Scheres A, Margulies D, Kelly A, Uddin LQ, Shehzad Z, Biswal B, Walters JR, Castellanos FX, Milham MP. 2008. Functional connectivity of human striatum: a resting state FMRI study. Cereb Cortex. 18:2735–2747. [PubMed]
  • Ell SW, Helie S, Hutchinson S. 2011. Contributions of the putamen to cognitive function. In: Costa A, Villalba E, editors. Horizons in neuroscience research. 7th ed. Hauppauge, NY: Nova Science Publishers; p. 29–52.
  • Estes WK. 1956. The problem of inference from curves based on group data. Psychol Bull. 53:134. [PubMed]
  • Frazier JA, Chiu S, Breeze JL, Makris N, Lange N, Kennedy DN, Herbert MR, Bent EK, Koneru VK, Dieterich ME. 2005. Structural brain magnetic resonance imaging of limbic and thalamic volumes in pediatric bipolar disorder. Am J Psychiat. 162:1256–1265. [PubMed]
  • Freedman D, Lane D. 1983. A nonstochastic interpretation of reported significance levels. J Bus Econ Stat. 1:292–298.
  • Gandour J, Dzemidzic M, Wong D, Lowe M, Tong Y, Hsieh L, Satthamnuwong N, Lurito J. 2003. Temporal integration of speech prosody is shaped by language experience: An fMRI study. Brain Lang. 84:318–336. [PubMed]
  • Gandour J, Wong D, Dzemidzic M, Lowe M, Tong Y, Li X. 2003. A cross-linguistic fMRI study of perception of intonation and emotion in Chinese. Hum Brain Mapp. 18:149–157. [PubMed]
  • Garner WR. 1974. The processing of information and structure. Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Geiser E, Notter M, Gabrieli JD. 2012. A corticostriatal neural system enhances auditory perception through temporal context processing. J Neurosci. 32:6177–6182. [PubMed]
  • Goldstein JM, Seidman LJ, Makris N, Ahern T, O'Brien LM, Caviness VS, Jr, Kennedy DN, Faraone SV, Tsuang MT. 2007. Hypothalamic abnormalities in schizophrenia: sex effects and genetic vulnerability. Biol Psychiat. 61:935–945. [PubMed]
  • Goudbeek M, Cutler A, Smits R. 2008. Supervised and unsupervised learning of multidimensionally varying non-native speech categories. Speech Commun. 50:109–125.
  • Grabner G, Janke AL, Budge MM, Smith D, Pruessner J, Collins DL. 2006. Symmetric atlasing and model based segmentation: an application to the hippocampus in older adults. In: Larsen R, Nielsen M, Sporring J, editors. Medical image computing and computer-assisted intervention – MICCAI 2006. Berlin, DE: Springer; p. 58–66. [PubMed]
  • Haenschel C, Vernon DJ, Dwivedi P, Gruzelier JH, Baldeweg T. 2005. Event-related brain potential correlates of human auditory sensory memory-trace formation. J Neurosci. 25:10494–10501. [PubMed]
  • Haruno M, Kawato M. 2006. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol. 95:948–959. [PubMed]
  • Hayasaka S, Nichols TE. 2003. Validating cluster size inference: random field and permutation methods. NeuroImage. 20:2343–2356. [PubMed]
  • Hickok G, Poeppel D. 2007. The cortical organization of speech processing. Nat Rev Neurosci. 8:393–402. [PubMed]
  • Hikosaka O, Sakamoto M, Usui S. 1989. Functional properties of monkey caudate neurons. III. Activities related to expectation of target and reward. J Neurophysiol. 61:814–832. [PubMed]
  • Hillenbrand J, Getty LA, Clark MJ, Wheeler K. 1995. Acoustic characteristics of American English vowels. J Acoust Soc Am. 97:3099–3111. [PubMed]
  • Holt LL, Lotto AJ. 2008. Speech perception within an auditory cognitive science framework. Curr Dir Psychol Sci. 17:42–46. [PMC free article] [PubMed]
  • Holt LL, Lotto AJ. 2010. Speech perception as categorization. Atten Percept Psychophys. 72:1218–1227. [PMC free article] [PubMed]
  • Iverson P, Kuhl PK, Akahane-Yamada R, Diesch E, Tohkura Y, Kettermann A, Siebert C. 2003. A perceptual interference account of acquisition difficulties for non-native phonemes. Cognition. 87:B47–B57. [PubMed]
  • Jenkinson M, Bannister P, Brady M, Smith S. 2002. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage. 17:825–841. [PubMed]
  • Jenkinson M, Beckmann CF, Behrens TE, Woolrich MW, Smith SM. 2012. FSL. NeuroImage. 62:782–790. [PubMed]
  • Jenkinson M, Pechaud M, Smith S. 2005. BET2: MR-based estimation of brain, skull and scalp surfaces. In: Eleventh Annual Meeting of the Organization for Human Brain Mapping.
  • Jenkinson M, Smith S. 2001. A global optimisation method for robust affine registration of brain images. Med Image Anal. 5:143–156. [PubMed]
  • Jongman A, Wayland R, Wong S. 2000. Acoustic characteristics of English fricatives. J Acoust Soc Am. 108:1252–1263. [PubMed]
  • Kennedy FE. 1995. Randomization tests in econometrics. J Bus Econ Stat. 13:85–94.
  • Koelsch S, Schulze K, Sammler D, Fritz T, Müller K, Gruber O. 2009. Functional architecture of verbal and tonal working memory: an FMRI study. Hum Brain Mapp. 30:859–873. [PubMed]
  • Lisker L. 1986. “Voicing” in English: a catalogue of acoustic features signaling /b/ versus /p/ in trochees. Lang Speech. 29:3–11. [PubMed]
  • Liu TT, Frank LR, Wong EC, Buxton RB. 2001. Detection power, estimation efficiency, and predictability in event-related fMRI. NeuroImage. 13:759–773. [PubMed]
  • Love BC, Medin DL, Gureckis TM. 2004. SUSTAIN: a network model of category learning. Psychol Rev. 111:309. [PubMed]
  • Maddox WT. 1999. On the dangers of averaging across observers when comparing decision bound models and generalized context models of categorization. Percept Psychophys. 61:354–374. [PubMed]
  • Maddox WT. 2002. Learning and attention in multidimensional identification and categorization: separating low-level perceptual processes and high-level decisional processes. J Exp Psychol Learn. 28:99–115. [PubMed]
  • Maddox WT, Ashby FG. 2004. Dissociating explicit and procedural-learning based systems of perceptual category learning. Behav Process. 66:309–332. [PubMed]
  • Maddox WT, Ashby FG, Bohil CJ. 2003. Delayed feedback effects on rule-based and information-integration category learning. J Exp Psychol Learn. 29:650. [PubMed]
  • Maddox WT, Ashby FG, Ing AD, Pickering AD. 2004. Disrupting feedback processing interferes with rule-based but not information-integration category learning. Mem Cogn. 32:582–591. [PubMed]
  • Maddox WT, Chandrasekaran B. 2014. Tests of a dual-systems model of speech category learning. Biling Lang Cogn. FirstView:1–20. [PMC free article] [PubMed]
  • Maddox WT, Chandrasekaran B, Smayda K, Yi H. 2013. Dual systems of speech category learning across the lifespan. Psychol Aging. 28:1042–1056. [PMC free article] [PubMed]
  • Maddox WT, Chandrasekaran B, Smayda K, Yi H, Koslov S, Beevers CG. 2014. Elevated depressive symptoms enhance reflexive but not reflective auditory category learning. Cortex. 58:186–198. [PMC free article] [PubMed]
  • Maddox WT, Filoteo JV. 2011. Stimulus range and discontinuity effects on information-integration category learning and generalization. Atten Percept Psycho. 73:1279–1295. [PMC free article] [PubMed]
  • Maddox WT, Glass BD, O'Brien JB, Filoteo JV, Ashby FG. 2010. Category label and response location shifts in category learning. Psychol Res. 74:219–236. [PMC free article] [PubMed]
  • Maddox WT, Ing AD, Lauritzen JS. 2006. Stimulus modality interacts with category structure in perceptual category learning. Percept Psychophys. 68:1176–1190. [PubMed]
  • Maddox WT, Love BC, Glass BD, Filoteo JV. 2008. When more is less: feedback effects in perceptual category learning. Cognition. 108:578–589. [PMC free article] [PubMed]
  • Maddox WT, Molis MR, Diehl RL. 2002. Generalizing a neuropsychological model of visual categorization to auditory categorization of vowels. Percept Psychophys. 64:584–597. [PubMed]
  • Makris N, Goldstein JM, Kennedy D, Hodge SM, Caviness VS, Faraone SV, Tsuang MT, Seidman LJ. 2006. Decreased volume of left and total anterior insular lobule in schizophrenia. Schizophr Res. 83:155–171. [PubMed]
  • Marian V, Blumenfeld HK, Kaushanskaya M. 2007. The Language Experience and Proficiency Questionnaire (LEAP-Q): assessing language profiles in bilinguals and multilinguals. J Speech Lang Hear R. 50:940. [PubMed]
  • Matelli M, Luppino G. 1996. Thalamic input to mesial and superior area 6 in the macaque monkey. J Comp Neurol. 372:59–87. [PubMed]
  • McCandliss BD, Fiez JA, Protopapas A, Conway M, McClelland JL. 2002. Success and failure in teaching the [r]-[l] contrast to Japanese adults: tests of a Hebbian model of plasticity and stabilization in spoken language perception. Cogn Affect Behav Ne. 2:89–108. [PubMed]
  • McClelland JL, Patterson K. 2002. Rules or connections in past-tense inflections: what does the evidence rule out? Trends Cogn Sci. 6:465–472. [PubMed]
  • McGettigan C, Warren JE, Eisner F, Marshall CR, Shanmugalingam P, Scott SK. 2011. Neural correlates of sublexical processing in phonological working memory. J Cogn Neurosci. 23:961–977. [PMC free article] [PubMed]
  • Mumford J. 2007. A guide to calculating percent change with Featquery. Unpublished Tech Report. Available from: URL
  • Nichols TE, Holmes AP. 2002. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp. 15:1–25. [PubMed]
  • Nomura E, Maddox W, Filoteo J, Ing A, Gitelman D, Parrish T, Mesulam M, Reber P. 2007. Neural correlates of rule-based and information-integration visual category learning. Cereb Cortex. 17:37–43. [PubMed]
  • Norris D, McQueen JM, Cutler A. 2003. Perceptual learning in speech. Cogn Psychol. 47:204–238. [PubMed]
  • Perrachione TK, Lee J, Ha LY, Wong PC. 2011. Learning a novel phonological contrast depends on interactions between individual differences and training paradigm design. J Acoust Soc Am. 130:461. [PubMed]
  • Pochon J, Levy R, Fossati P, Lehericy S, Poline J, Pillon B, Le Bihan D, Dubois B. 2002. The neural system that bridges reward and cognition in humans: an fMRI study. Proc Natl Acad Sci. 99:5669–5674. [PubMed]
  • Poldrack RA, Packard MG. 2003. Competition among multiple memory systems: converging evidence from animal and human brain studies. Neuropsychologia. 41:245–251. [PubMed]
  • Rauschecker JP, Scott SK. 2009. Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing. Nat Neurosci. 12:718–724. [PMC free article] [PubMed]
  • Reale R, Imig T. 1983. Auditory cortical field projections to the basal ganglia of the cat. Neuroscience. 8:67–86. [PubMed]
  • Romanski LM, Tian B, Fritz J, Mishkin M, Goldman-Rakic PS, Rauschecker JP. 1999. Dual streams of auditory afferents target multiple domains in the primate prefrontal cortex. Nat Neurosci. 2:1131–1136. [PMC free article] [PubMed]
  • Rorden C. 2007. MRIcron [Computer software].
  • Sams M, Hari R, Rif J, Knuutila J. 1993. The human auditory sensory memory trace persists about 10 sec: neuromagnetic evidence. J Cogn Neurosci. 5:363–370. [PubMed]
  • Seger CA. 2008. How do the basal ganglia contribute to categorization? Their roles in generalization, response selection, and learning via feedback. Neurosci Biobehav Rev. 32:265–278. [PMC free article] [PubMed]
  • Seger CA. 2009. The involvement of corticostriatal loops in learning across tasks, species, and methodologies. In: Groenewegen HJ, Voorn P, Berendse HW, Mulder AB, Cools AR, editors. The basal ganglia IX. Berlin, DE: Springer; p. 25–39.
  • Seger CA, Cincotta CM. 2005. The roles of the caudate nucleus in human classification learning. J Neurosci. 25:2941–2951. [PubMed]
  • Seger CA, Miller EK. 2010. Category learning in the brain. Annu Rev Neurosci. 33:203–219. [PMC free article] [PubMed]
  • Seger CA, Peterson EJ, Cincotta CM, Lopez-Paniagua D, Anderson CW. 2010. Dissociating the contributions of independent corticostriatal systems to visual categorization learning through the use of reinforcement learning modeling and Granger causality modeling. NeuroImage. 50:644–656. [PMC free article] [PubMed]
  • Selemon L, Goldman-Rakic P. 1985. Longitudinal topography and interdigitation of corticostriatal projections in the rhesus monkey. J Neurosci. 5:776–794. [PubMed]
  • Shepard RN. 1964. Attention and the metric structure of the stimulus space. J Math Psychol. 1:54–87.
  • Smith JD, Berg ME, Cook RG, Murphy MS, Crossley MJ, Boomer J, Spiering B, Beran MJ, Church BA, Ashby FG. 2012a. Implicit and explicit categorization: a tale of four species. Neurosci Biobehav R. 36:2355–2369. [PMC free article] [PubMed]
  • Smith JD, Crossley MJ, Boomer J, Church BA, Beran MJ, Ashby FG. 2012b. Implicit and explicit category learning by capuchin monkeys (Cebus apella). J Comp Psychol. 126:294–304. [PMC free article] [PubMed]
  • Smith SM. 2002. Fast robust automated brain extraction. Hum Brain Mapp. 17:143–155. [PubMed]
  • Smith SM, Jenkinson M, Woolrich MW, Beckmann CF, Behrens TE, Johansen-Berg H, Bannister PR, De Luca M, Drobnjak I, Flitney DE. 2004. Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage. 23:S208–S219. [PubMed]
  • Spence SA, Crimlisk HL, Cope H, Ron MA, Grasby PM. 2000. Discrete neurophysiological correlates in prefrontal cortex during hysterical and feigned disorder of movement. Lancet. 355:1243–1244. [PubMed]
  • Spiering BJ, Ashby FG. 2008. Response processes in information–integration category learning. Neurobiol Learn Mem. 90:330–338. [PMC free article] [PubMed]
  • Strand F, Forssberg H, Klingberg T, Norrelgen F. 2008. Phonological working memory with auditory presentation of pseudo-words—an event related fMRI Study. Brain Res. 1212:48–54. [PubMed]
  • Vallabha GK, McClelland JL, Pons F, Werker JF, Amano S. 2007. Unsupervised learning of vowel categories from infant-directed speech. Proc Natl Acad Sci. 104:13273–13278. [PubMed]
  • Wagenmakers E-J, Farrell S. 2004. AIC model selection using Akaike weights. Psychon B Rev. 11:192–196. [PubMed]
  • Waldschmidt JG, Ashby FG. 2011. Cortical and striatal contributions to automaticity in information-integration categorization. NeuroImage. 56:1791–1802. [PMC free article] [PubMed]
  • Weil RS, Furl N, Ruff CC, Symmonds M, Flandin G, Dolan RJ, Driver J, Rees G. 2010. Rewarding feedback after correct visual discriminations has both general and specific influences on visual cortex. J Neurophysiol. 104:1746–1757. [PMC free article] [PubMed]
  • Wickens TD. 1982. Models for behavior: stochastic processes in psychology. San Francisco, CA: WH Freeman.
  • Williams ZM, Eskandar EN. 2006. Selective enhancement of associative learning by microstimulation of the anterior caudate. Nat Neurosci. 9:562–568. [PubMed]
  • Wilson CJ. 1995. The contribution of cortical neurons to the firing pattern of striatal spiny neurons. In: Houk JC, Davis JL, Beiser DG, editors. Models of information processing in the basal ganglia. Cambridge, MA: MIT Press; p. 29–50.
  • Wong PC, Parsons LM, Martinez M, Diehl RL. 2004. The role of the insular cortex in pitch pattern perception: the effect of linguistic contexts. J Neurosci. 24:9153–9160. [PubMed]
  • Wong PC, Perrachione TK, Gunasekera G, Chandrasekaran B. 2009. Communication disorders in speakers of tone languages: etiological bases and clinical considerations. Semin Speech Lang. 30:162–173. [PMC free article] [PubMed]
  • Wong PC, Warrier CM, Penhune VB, Roy AK, Sadehh A, Parrish TB, Zatorre RJ. 2008. Volume of left Heschl’s gyrus and linguistic pitch learning. Cereb Cortex. 18:828–836. [PMC free article] [PubMed]
  • Woolrich MW, Jbabdi S, Patenaude B, Chappell M, Makni S, Behrens T, Beckmann C, Jenkinson M, Smith SM. 2009. Bayesian analysis of neuroimaging data in FSL. NeuroImage. 45:S173–S186. [PubMed]
  • Yeterian E, Pandya D. 1998. Corticostriatal connections of the superior temporal region in rhesus monkeys. J Comp Neurol. 399:384–402. [PubMed]

Articles from Cerebral Cortex (New York, NY) are provided here courtesy of Oxford University Press