It sometimes happens that when someone asks a question, the addressee does not give an adequate answer, for instance by leaving out part of the required information. The person who posed the question may wonder why the information was omitted and engage in extensive processing to find out what the partial answer actually means. The present study looks at the neural correlates of the pragmatic processes invoked by partial answers to questions. Two experiments are presented in which participants read mini-dialogues while their event-related brain potentials (ERPs) were recorded. In both experiments, violating the dependency between questions and answers led to an increase in the amplitude of the P600 component. We interpret these P600 effects as reflecting the increased effort involved in creating a coherent representation of what is communicated. This effortful processing might include the computation of what the dialogue participant meant to communicate by withholding information. Our study is one of the few investigating language processing in conversation, albeit with participants who were ‘eavesdroppers’ rather than actual interactants. Our results add to the still small range of pragmatic phenomena known to modulate the processes underlying the P600 component, and suggest that people immediately attempt to regain cohesion when a question-answer dependency is violated in an ongoing conversation.
Learning the functional properties of objects is a core mechanism in the development of conceptual, cognitive and linguistic knowledge in children. The cerebral processes underlying these learning mechanisms remain unclear in adults and unexplored in children. Here, we investigated the neurophysiological patterns underpinning the learning of functions for novel objects in 10-year-old healthy children. Event-related fields (ERFs) were recorded using magnetoencephalography (MEG) during a picture-definition task. Two MEG sessions were administered, separated by a behavioral verbal learning session during which children learned short definitions of the “magical” function of 50 unknown non-objects. Additionally, 50 familiar real objects and 50 other unknown non-objects for which no functions were taught were presented at both MEG sessions. Children learned at least 75% of the 50 proposed definitions in less than one hour, illustrating children's powerful ability to rapidly map new functional meanings onto novel objects. Pre- and post-learning ERF differences were analyzed first in sensor space and then in source space. Results in sensor space disclosed a learning-dependent modulation of ERFs for newly learned non-objects, developing 500–800 msec after stimulus onset. Source-space analyses windowed over this late temporal component of interest disclosed underlying activity in right parietal, bilateral orbito-frontal and right temporal regions. Altogether, our results suggest that learning-related changes in late ERF components over those regions may support the challenging task of rapidly creating new semantic representations that underpin the processing of the meaning and functions of novel objects in children.
Much of what we know about the effect of stimulus repetition on neuroelectric adaptation comes from studies using artificially produced pure tones or harmonic complex sounds. Little is known about the neural processes associated with the representation of everyday sounds and how these may be affected by aging. In this study, we used real-life, meaningful sounds presented at various azimuth positions and found that auditory evoked responses peaking at about 100 and 180 ms after sound onset decreased in amplitude with stimulus repetition. This neural adaptation was greater in young than in older adults and was more pronounced when the same sound was repeated at the same location. Moreover, the P2 waves showed differential patterns of domain-specific adaptation when location and identity were repeated among young adults. Background noise decreased ERP amplitudes and modulated the magnitude of repetition effects on both N1 and P2 amplitudes, and these effects were comparable in young and older adults. These findings reveal an age-related difference in the neural processes associated with adaptation to meaningful sounds, which may relate to older adults’ difficulty in ignoring task-irrelevant stimuli.
The audibility of a target tone in a multitone background masker is enhanced by the presentation of a precursor sound consisting of the masker alone. There is evidence that precursor-induced neural adaptation plays a role in this perceptual enhancement. However, the precursor may also be used strategically by listeners as a spectral template of the following masker, to better segregate it from the target. In the present study, we tested this hypothesis by measuring the audibility of a target tone in a multitone masker after the presentation of precursors which, in some conditions, were made dissimilar to the masker by gating their components asynchronously. The precursor and the following sound were presented either to the same ear or to opposite ears. In either case, we found no significant difference in the amount of enhancement produced by synchronous and asynchronous precursors. In a second experiment, listeners had to judge whether a synchronous multitone complex contained exactly the same tones as a preceding precursor complex or had one tone fewer. In this experiment, listeners performed significantly better with synchronous than with asynchronous precursors, showing that asynchronous precursors were poorer perceptual templates of the synchronous multitone complexes. Overall, our findings indicate that precursor-induced auditory enhancement cannot be fully explained by the strategic use of the precursor as a template of the following masker. Our results are consistent with an explanation of enhancement based on selective neural adaptation taking place at a central locus of the auditory system.
Humans can anticipate and prepare for uncertainties to achieve a goal. However, it is difficult to maintain this effort over a prolonged period of time. Inappropriate behavior is impulsively (or mindlessly) activated by an external trigger, which can result in serious consequences such as traffic crashes. We therefore examined the neural mechanisms underlying such impulsive responding using functional magnetic resonance imaging (fMRI). Twenty-two participants performed a block-designed sustained attention to response task (SART), in which each task block consisted of consecutive Go trials followed by a NoGo trial at the end. This task configuration enabled us to measure compromised preparation for NoGo trials during Go responses in the form of reduced Go reaction times. Accordingly, a parametric modulation analysis was conducted on the fMRI data using block-based mean Go reaction times as an online marker of impulsive responding in the SART. We found that activity in the right dorsolateral prefrontal cortex (DLPFC) and the bilateral intraparietal sulcus (IPS) was positively modulated by mean Go reaction times. In addition, activity in the medial prefrontal cortex (MPFC) and the posterior cingulate cortex (PCC) was negatively modulated by mean Go reaction times, albeit statistically weakly. Taken together, spontaneously reduced activity in the right DLPFC and the IPS and spontaneously elevated activity in the MPFC and the PCC were associated with impulsive responding in the SART. These results suggest that such a spontaneous shift in the pattern of brain activity results in impulsive responding in monotonous situations, which, in turn, might cause human errors in real work environments.
Research on the neural basis of speech-reading implicates a network of auditory language regions involving the inferior frontal cortex, premotor cortex and sites along the superior temporal cortex. In audiovisual speech studies, neural activity is consistently reported in the posterior superior temporal sulcus (pSTS), a site that has been implicated in multimodal integration. Traditionally, multisensory interactions have been considered high-level processes engaging heteromodal association cortices (such as the STS). Recent work, however, challenges this notion and suggests that multisensory interactions may occur in low-level unimodal sensory cortices. While previous audiovisual speech studies demonstrate that high-level multisensory interactions occur in pSTS, it remains unclear how early in the processing hierarchy these multisensory interactions may occur. The goal of the present fMRI experiment was to investigate how visual speech influences activity in auditory cortex above and beyond its response to auditory speech. In an audiovisual speech experiment, subjects were presented with auditory speech with and without congruent visual input. Holding the auditory stimulus constant across the experiment, we investigated how the addition of visual speech influences activity in auditory cortex. We demonstrate that congruent visual speech increases activity in auditory cortex.
Earlier studies have shown considerable intersubject synchronization of brain activity when subjects watch the same movie or listen to the same story. Here we investigated the across-subjects similarity of brain responses to speech and non-speech sounds in a continuous audio drama designed for blind people. Thirteen healthy adults listened for ∼19 min to the audio drama while their brain activity was measured with 3 T functional magnetic resonance imaging (fMRI). An intersubject-correlation (ISC) map, computed across the whole experiment to assess the stimulus-driven extrinsic brain network, indicated statistically significant ISC in temporal, frontal and parietal cortices, the cingulate cortex, and the amygdala. Group-level independent component (IC) analysis was used to parcel the brain signals into functionally coupled networks, and the dependence of the ICs on external stimuli was tested by comparing them with the ISC map. This procedure revealed four extrinsic ICs, of which two, covering non-overlapping areas of the auditory cortex, were modulated by both speech and non-speech sounds. The two other extrinsic ICs, one left-hemisphere-lateralized and the other right-hemisphere-lateralized, were speech-related and comprised the superior and middle temporal gyri, the temporal poles, and the left angular and inferior orbital gyri. In areas of low ISC, four ICs defined as intrinsic fluctuated similarly to the time courses of either the speech-sound-related or the all-sounds-related extrinsic ICs. These ICs included the superior temporal gyrus, the anterior insula, and the frontal, parietal and midline occipital cortices. Taken together, substantial intersubject synchronization of cortical activity was observed in subjects listening to an audio drama, with results suggesting that speech is processed in two separate networks, one dedicated to the processing of speech sounds and the other responding to both speech and non-speech sounds.
Voice, as a secondary sexual characteristic, is known to affect the perceived attractiveness of human individuals. But the underlying mechanism of vocal attractiveness has remained unclear. Here, we presented human listeners with acoustically altered natural sentences and fully synthetic sentences with systematically manipulated pitch, formants and voice quality, based on a principle of body-size projection reported for animal calls and emotional human vocal expressions. The results show that male listeners preferred a female voice that signals a small body size, with relatively high pitch, wide formant dispersion and breathy voice, while female listeners preferred a male voice that signals a large body size, with low pitch and narrow formant dispersion. Interestingly, however, male vocal attractiveness was also enhanced by breathiness, which presumably softened the aggressiveness associated with a large body size. These results, together with the additional finding that the same vocal dimensions also affect emotion judgment, indicate that humans still employ a vocal interaction strategy used in animal calls despite the development of complex language.
Behavioral studies of spoken word memory have shown that context congruency facilitates both word and source recognition, though the level at which context exerts its influence remains equivocal. We measured event-related potentials (ERPs) while participants performed both types of recognition task with words spoken in four voices. Two voice parameters (i.e., gender and accent) varied between speakers, such that none, one, or both of these parameters could be congruent between study and test. Results indicated that reinstating the study voice at test facilitated both word and source recognition, compared to similar or no context congruency at test. The behavioral effects were paralleled by two ERP modulations. First, in the word recognition test, the left parietal old/new effect showed a positive deflection that reflected context congruency between study and test words; specifically, the same-speaker condition produced the most positive deflection of all correctly identified old words. In the source recognition test, a right frontal positivity was found for the same-speaker condition compared to the different-speaker conditions, regardless of response success. Taken together, the results of this study suggest that the benefit of context congruency is reflected both behaviorally and in ERP modulations traditionally associated with recognition memory.
In Japanese, vowel duration can distinguish the meaning of words. For infants to learn this phonemic contrast through simple distributional analyses, there should be reliable differences in the duration of short and long vowels, and the frequency distribution of vowels must make these differences sufficiently salient in the input. In this study, we evaluated these requirements of phonemic learning by analyzing the duration of vowels in over 11 hours of Japanese infant-directed speech. We found that long vowels are substantially longer than short vowels in the input directed to infants, for each of the five oral vowels. However, we also found that learning phonemic length from the overall distribution of vowel duration would not be easy for a simple distributional learner, because of the large base-rate effect (i.e., 94% of vowels are short) and because of the many factors that influence vowel duration (e.g., intonational phrase boundaries, word boundaries, and vowel height). A successful learner would therefore need to take into account additional factors, such as prosodic and lexical cues, in order to discover that duration can contrast the meaning of words in Japanese. These findings highlight the importance of taking into account the naturalistic distributions of lexicons and acoustic cues when modeling early phonemic learning.
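To illustrate the base-rate problem described in this abstract, the following is a minimal sketch (not from the study) of a "simple distributional learner": a two-component Gaussian mixture fitted to simulated log vowel durations in which 94% of tokens are short. The duration means, spreads, and the choice of model are illustrative assumptions only.

```python
# Illustrative sketch only: a "simple distributional learner" applied to
# simulated vowel durations with a 94% short / 6% long base rate.
# All numerical values (means, spreads) are hypothetical, not corpus estimates.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

n = 10_000
# Log-scale durations: "short" vowels around ~70 ms, "long" around ~140 ms,
# with spread inflated to mimic prosodic and contextual variability.
short = rng.normal(np.log(70), 0.5, size=int(n * 0.94))
long_ = rng.normal(np.log(140), 0.5, size=int(n * 0.06))
log_durations = np.concatenate([short, long_]).reshape(-1, 1)

# The learner: fit a two-category Gaussian mixture to the pooled distribution.
gmm = GaussianMixture(n_components=2, n_init=10, random_state=0).fit(log_durations)
print("Recovered category means (ms):", np.round(np.exp(gmm.means_.ravel()), 1))
print("Recovered category weights:   ", np.round(gmm.weights_.ravel(), 2))
# With such a skewed base rate and overlapping distributions, the two recovered
# components often split the dominant short-vowel mass rather than isolating
# the rare long-vowel category, illustrating why prosodic and lexical cues
# may be needed in addition to raw duration statistics.
```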
The present study evaluated the relation between speech perception in the presence of background noise and temporal processing ability in listeners with auditory neuropathy (AN).
The study comprised two experiments. In the first experiment, the temporal resolution of listeners with normal hearing and of listeners with AN was evaluated using measures of the temporal modulation transfer function and of frequency modulation detection at modulation rates of 2 and 10 Hz. In the second experiment, speech perception in quiet and in noise was evaluated at three signal-to-noise ratios (SNRs: 0, 5, and 10 dB).
Results demonstrated that listeners with AN performed significantly more poorly than normal-hearing listeners in both amplitude modulation and frequency modulation detection, indicating a significant impairment in extracting envelope as well as fine-structure cues from the signal. Furthermore, there was a significant correlation between measures of temporal resolution and speech perception in noise.
These results suggest that an impaired ability to efficiently process the envelope and fine-structure cues of the speech signal may underlie the extreme difficulty that listeners with AN experience when perceiving speech in noise.
The organization of sound into meaningful units is fundamental to the processing of auditory information such as speech and music. In expressive music performance, structural units or phrases may become particularly distinguishable through subtle timing variations that highlight musical phrase boundaries. As such, expressive timing may support the successful parsing of otherwise continuous musical material. Using the event-related potential (ERP) technique, we investigated whether expressive timing modulates the neural processing of musical phrases. Musicians and laymen listened to short atonal scale-like melodies that were presented either isochronously (deadpan) or with expressive timing cues emphasizing the melodies’ two-phrase structure. Melodies were presented in an active and a passive condition. Expressive timing facilitated the processing of phrase boundaries, as indicated by decreased N2b amplitude and enhanced P3a amplitude for target phrase boundaries and larger P2 amplitude for non-target boundaries. When timing cues were lacking, task demands increased, especially for laymen, as reflected by reduced P3a amplitude. In line with this, the N2b occurred earlier for musicians in both conditions, indicating generally faster target detection compared to laymen. Importantly, the elicitation of a P3a-like response to phrase boundaries marked by a pitch leap during passive exposure suggests that expressive timing information is automatically encoded and may lead to an involuntary allocation of attention towards significant events within a melody. We conclude that subtle timing variations in music performance prepare the listener for musical key events by directing and guiding attention towards their occurrences. That is, expressive timing facilitates the structuring and parsing of continuous musical material even when the auditory input is unattended.
This study investigated a theoretically challenging dissociation between good production and poor perception of tones among neurologically unimpaired native speakers of Cantonese. The dissociation is referred to as the near-merger phenomenon in sociolinguistic studies of sound change. In a passive oddball paradigm, lexical and nonlexical syllables of the T1/T6 and T4/T6 contrasts were presented to elicit the mismatch negativity (MMN) and P3a from two groups of participants: those who could produce and distinguish all tones in the language (Control) and those who could produce all tones but specifically failed to distinguish between T4 and T6 in perception (Dissociation). The presence of an MMN to T1/T6 and a null response to T4/T6 for lexical syllables in the dissociation group confirmed the near-merger phenomenon. The observation that the control participants exhibited a statistically reliable MMN to lexical syllables of T1/T6, weaker responses to nonlexical syllables of T1/T6 and lexical syllables of T4/T6, and finally a null response to nonlexical syllables of T4/T6 suggests the involvement of top-down processing in speech perception. Furthermore, the stronger P3a response of the control group, compared with the dissociation group in the same experimental conditions, may be taken to indicate higher cognitive capability in attention switching, auditory attention or memory in the control participants. This cognitive difference, together with our speculation that constant top-down predictions without complete bottom-up analysis of acoustic signals in speech recognition may reduce one’s sensitivity to small acoustic contrasts, accounts for the occurrence of the dissociation in some individuals but not others.
Computational and experimental research has revealed that auditory sensory predictions are derived from regularities of the current environment by using internal generative models. So far, however, it has not been addressed how the auditory system handles situations that give rise to redundant or even contradictory predictions derived from different sources of information. To address this question, we measured error signals in event-related brain potentials (ERPs) in response to violations of auditory predictions. Sounds could be predicted on the basis of overall probability, i.e., one sound was presented frequently and another sound rarely. Furthermore, each sound was predicted by an informative visual cue. Participants’ task was to use the cue and to discriminate the two sounds as fast as possible. Violations of the probability-based prediction (i.e., a rare sound) as well as violations of the visual-auditory prediction (i.e., an incongruent sound) elicited error signals in the ERPs (Mismatch Negativity [MMN] and Incongruency Response [IR]). The respective error signals were observed even when the overall probability and the visual cue predicted different sounds; that is, the auditory system concurrently maintains and tests contradictory predictions. Moreover, if the same sound was predicted by both sources, we observed an additive error signal (in scalp potential and primary current density) equal to the sum of the two specific error signals. Thus, the auditory system maintains and tolerates redundant and contradictory predictions that are represented functionally independently. We argue that the auditory system exploits all currently active regularities in order to optimally prepare for future events.
Songbirds are one of the few groups of animals that learn the sounds used for vocal communication during development. Like humans, songbirds memorize vocal sounds based on auditory experience with vocalizations of adult “tutors”, and then use auditory feedback from self-produced vocalizations to gradually match their motor output to the memory of tutor sounds. In humans, investigations of early vocal learning have focused mainly on the perceptual skills of infants, whereas studies of songbirds have focused on measures of vocal production. To fully exploit songbirds as a model for human speech, understand the neural basis of learned vocal behavior, and investigate links between vocal perception and production, studies of songbirds must examine both behavioral measures of perception and neural measures of discrimination during development. Here we used behavioral and electrophysiological assays of the ability of songbirds to distinguish vocal calls of varying frequencies at different stages of vocal learning. The results show that neural tuning in auditory cortex mirrors behavioral improvements in the ability to make perceptual distinctions among vocal calls as birds are engaged in vocal learning. Thus, separate measures of neural discrimination and behavioral perception yielded highly similar trends during the course of vocal development. The timing of this improvement in the ability to distinguish vocal sounds parallels our previous finding of substantial refinement of axonal connectivity in cortico-basal ganglia pathways necessary for vocal learning.
Top-down attention to spatial and temporal cues has been thoroughly studied in the visual domain. However, because the neural systems important for auditory top-down temporal attention (i.e., attention based on time-interval cues) remain undefined, it is unclear how brain activity during attention directed to auditory spatial locations differs from that during attention directed to time intervals. Using functional magnetic resonance imaging (fMRI), we measured activation in a cue-target paradigm in which a visual cue directed attention to an auditory target within either the spatial or the temporal domain. Imaging results showed that the dorsal frontoparietal network (dFPN), which consists of the bilateral intraparietal sulcus and the frontal eye field (FEF), responded to spatial orienting of attention, but bilateral FEF activity was absent during temporal orienting of attention. Furthermore, the fMRI results indicated that activity in the right ventrolateral prefrontal cortex (VLPFC) was significantly stronger during spatial orienting of attention than during temporal orienting of attention, while the dorsolateral prefrontal cortex (DLPFC) showed no significant differences between the two processes. We conclude that the bilateral dFPN and the right VLPFC contribute to auditory spatial orienting of attention. Furthermore, specific activations related to temporal cognition were confirmed within the superior occipital gyrus, tegmentum, motor area, thalamus and putamen.
Little is known about the timing of activating memory for objects and their associated perceptual properties, such as colour, yet this is important for theories of human cognition. We investigated the time course of early cognitive processes related to the activation of object shape and object shape+colour representations, respectively, during memory retrieval, as assessed by repetition priming in an event-related potential (ERP) study. The main findings were as follows: (1) we identified a unique early modulation of mean ERP amplitude during the N1 that was associated with the activation of object shape independently of colour; (2) we also found a subsequent early P2 modulation of mean amplitude over the same electrode clusters associated with the activation of object shape+colour representations; (3) these findings were apparent for both familiar (i.e., correctly coloured, e.g., a yellow banana) and novel (i.e., incorrectly coloured, e.g., a blue strawberry) objects; and (4) neither modulation of mean ERP amplitude was evident during the P3. Together the findings delineate the timing of object shape and colour memory systems and support the notion that perceptual representations of object shape mediate the retrieval of temporary shape+colour representations for familiar and novel objects.
Facial emotions and emotional body postures can easily grab attention in social communication. In the context of faces, gaze has been shown to be an important cue for orienting attention, but less is known about other important body parts such as hands. In the present study we investigated whether hands may orient attention through the emotional features they convey. By implying motion in static photographs of hands, we aimed to furnish observers with information about the intention to act and to test whether this interacted with the automatic coding of the hands. Specifically, we compared neutral and frontal hands to emotionally threatening hands, rotated along their radial-ulnar axes, in a Sidedness task (a Simon-like task based on automatic access to body representation). Results showed a Sidedness effect for both the palm and the back views with both neutral and emotional hands. More importantly, no difference was found between the two views for neutral hands, but a difference emerged for emotional hands: reaction times were faster for the palm view than for the back view. This difference was ascribed to the palm view's “offensive” pose, a source of threat that might have raised participants' arousal. This hypothesis was also supported by conscious evaluations along the dimensions of valence (pleasant-unpleasant) and arousal. Results are discussed in light of emotional feature coding.
The presence of non-simultaneous maskers can result in strong impairment of auditory intensity resolution relative to a condition without maskers, and it produces a complex pattern of effects that is difficult to explain on the basis of peripheral processing. We suggest that the failure of selective attention to the target tones is a useful framework for understanding these effects. Two experiments tested the hypothesis that the sequential grouping of the targets and the maskers into separate auditory objects facilitates selective attention and therefore reduces the masker-induced impairment in intensity resolution. In Experiment 1, a condition favoring the processing of the maskers and the targets as two separate auditory objects, due to grouping by temporal proximity, was contrasted with the usual forward-masking setting in which the masker and the target presented within each observation interval of the two-interval task can be expected to be grouped together. As expected, the former condition resulted in a significantly smaller masker-induced elevation of the intensity difference limens (DLs). In Experiment 2, embedding the targets in an isochronous sequence of maskers led to a significantly smaller DL elevation than control conditions not favoring the perception of the maskers as a separate auditory stream. The observed effects of grouping are compatible with the assumption that a precise representation of target intensity is available at the decision stage, but that this information is used only in a suboptimal fashion owing to limitations of selective attention. The data can be explained within a framework of object-based attention. The results impose constraints on physiological models of intensity discrimination. We discuss candidate structures for physiological correlates of the psychophysical data.
The ability to detect sudden changes in the environment is critical for survival. Hearing is hypothesized to play a major role in this process by serving as an “early warning device,” rapidly directing attention to new events. Here, we investigate listeners' sensitivity to changes in complex acoustic scenes: what makes certain events “pop out” and grab attention while others remain unnoticed? We use artificial “scenes” populated by multiple pure-tone components, each with a unique frequency and amplitude modulation rate. Importantly, these scenes lack semantic attributes, which may have confounded previous studies, thus allowing us to probe the low-level processes involved in auditory change perception. Our results reveal a striking difference between “appear” and “disappear” events. Listeners are remarkably tuned to object appearance: change detection and identification performance are at ceiling, response times are short, and there is little effect of scene size, suggesting a pop-out process. In contrast, listeners have difficulty detecting disappearing objects, even in small scenes: performance rapidly deteriorates with growing scene size, response times are slow, and even when a change is detected, the changed component is rarely successfully identified. We also measured change detection performance when a noise or silent gap was inserted at the time of change, or when the scene was interrupted by a distractor that occurred at the time of change but did not mask any scene elements. Gaps adversely affected the processing of item appearance but not disappearance, whereas distractors reduced both appearance and disappearance detection. Together, our results suggest a role for neural adaptation and sensitivity to transients in auditory change detection, similar to what has been demonstrated for visual change detection. Importantly, listeners consistently performed better for item addition (relative to deletion) across all scene interruptions used, suggesting a robust perceptual representation of item appearance.
In everyday life, we need the capacity to flexibly shift attention between alternative sound sources. However, relatively little work has been done to elucidate the mechanisms of attention shifting in the auditory domain. Here, we used a mixed event-related/sparse-sampling fMRI approach to investigate this essential cognitive function. In each 10-sec trial, subjects were instructed to wait for an auditory “cue” signaling the location where a subsequent “target” sound was likely to be presented. The target was occasionally replaced by an unexpected “novel” sound in the uncued ear, to trigger involuntary attention shifting. To maximize the attention effects, cues, targets, and novels were embedded within dichotic 800-Hz vs. 1500-Hz pure-tone “standard” trains. The sound of clustered fMRI acquisition (starting at t = 7.82 sec) served as a controlled trial-end signal. Our approach revealed notable activation differences between the conditions. Cued voluntary attention shifting activated the superior intraparietal sulcus (IPS), whereas novelty-triggered involuntary orienting activated the inferior IPS and certain subareas of the precuneus. Clearly more widespread activations were observed in the premotor cortex, including the frontal eye fields, during voluntary than during involuntary orienting. Moreover, we found evidence for a frontoinsular-cingular attentional control network, consisting of the anterior insula, inferior frontal cortex, and medial frontal cortices, which was activated during both target discrimination and voluntary attention shifting. Finally, novels and targets activated much wider areas of the superior temporal auditory cortices than did shifting cues.
In sentence comprehension research, the case system, one of the subsystems of the language processing system, has been assumed to play a crucial role in signifying relationships between noun phrases (NPs) and other elements in sentences, such as verbs, prepositions, nouns, and tense. So far, however, little attention has been paid to the question of how cases are processed in the brain. To address this question, the current study used fMRI to scan the brain activity of 15 native English speakers during an English case-processing task. The results showed that, while the processing of all cases activated the left inferior frontal gyrus and the posterior part of the middle temporal gyrus, genitive case processing activated these two regions more strongly than nominative and accusative case processing did. Because the effect of differences in behavioral performance among the three cases was excluded from the brain activation data, the observed activation differences are likely due to different processing patterns among the cases, indicating that cases are processed differently in the brain. The difference in activation between genitive case processing and nominative/accusative case processing may be due to the difference in structural complexity between them.
In this study we sought to elucidate the mechanisms underlying the effects of trial history on information processing. We focused specifically on the contributions of conflict control and S-R binding to sequential trial effects. Performance and brain activity were measured while participants performed the Stroop task continuously for two hours. Mental fatigue, which is known to influence top-down processing, was used to dissociate effects operating via top-down and bottom-up mechanisms. We confirm that performance in the Stroop task is indeed strongly modulated by stimulus history. Performance was affected by the kind of advance information available; depending on this information, adjustments were made, resulting in differential effects of cognitive conflict and S-R binding on subsequent performance. The influence of mental fatigue on information processing was mainly related to general effects on attention.
Given that both the auditory and visual systems have anatomically separate object identification (“what”) and spatial (“where”) pathways, it is of interest whether attention-driven cross-sensory modulations occur separately within these feature domains. Here, we investigated how auditory “what” vs. “where” attention tasks modulate activity in visual pathways using cortically constrained source estimates of magnetoencephalographic (MEG) oscillatory activity. In the absence of visual stimuli or tasks, subjects were presented with a sequence of auditory-stimulus pairs and instructed to attend selectively to phonetic (“what”) vs. spatial (“where”) aspects of these sounds, or to listen passively. To investigate sustained modulatory effects, oscillatory power was estimated from the time periods between sound-pair presentations. In comparison to attention to sound locations, phonetic auditory attention was associated with stronger alpha (7–13 Hz) power in several visual areas (primary visual cortex; lingual, fusiform, and inferior temporal gyri; lateral occipital cortex), as well as in higher-order visual/multisensory areas including lateral/medial parietal and retrosplenial cortices. Region-of-interest (ROI) analyses of dynamic changes, from which the sustained effects had been removed, suggested further power increases during Attend Phoneme vs. Attend Location, centered in the alpha range 400–600 ms after the onset of the second sound of each stimulus pair. These results suggest distinct modulations of visual system oscillatory activity during auditory attention to sound object identity (“what”) vs. sound location (“where”). The alpha modulations could be interpreted as reflecting enhanced crossmodal inhibition of feature-specific visual pathways and adjacent audiovisual association areas during “what” vs. “where” auditory attention.
Theories of visual perception agree that visual recognition begins with global analysis and ends with detailed analysis. Converging results from neurophysiological, computational, and behavioral studies indicate that the totality of visual information is not conveyed immediately, but that its analysis follows a predominantly coarse-to-fine processing sequence (low spatial frequencies are extracted first, followed by high spatial frequencies). We tested whether such processing continues to occur in normally aging subjects. Young and aged participants performed a categorization task (indoor vs. outdoor scenes) using dynamic natural scene stimuli that followed either a coarse-to-fine (CtF) sequence or the reverse fine-to-coarse (FtC) sequence. The results show that young participants categorized CtF sequences more quickly than FtC sequences. However, sequence processing interacted with semantic category only for aged participants. The present data support the notion that CtF categorization remains effective in aged participants, but is constrained by the spatial features of the scenes, thus highlighting new perspectives for visual models.