Evidence from functional neuroimaging indicates that visual perception of human faces and bodies is carried out by distributed networks of face and body-sensitive areas in the occipito-temporal cortex. However, the dynamics of activity in these areas, needed to understand their respective functional roles, are still largely unknown. We monitored brain activity with millisecond time resolution by recording magnetoencephalographic (MEG) responses while participants viewed photographs of faces, bodies, and control stimuli. The cortical activity underlying the evoked responses was estimated with anatomically-constrained noise-normalised minimum-norm estimate and statistically analysed with spatiotemporal cluster analysis.
Our findings point to distinct spatiotemporal organization of the neural systems for face and body perception. Face-selective cortical currents were found at early latencies (120–200 ms) in a widespread occipito-temporal network including the ventral temporal cortex (VTC). In contrast, early body-related responses were confined to the lateral occipito-temporal cortex (LOTC). These were followed by strong sustained body-selective responses in the orbitofrontal cortex from 200–700 ms, and in the lateral temporal cortex and VTC after 500 ms latency. Our data suggest that the VTC region has a key role in the early processing of faces, but not of bodies. Instead, the LOTC, which includes the extra-striate body area (EBA), appears the dominant area for early body perception, whereas the VTC contributes to late and post-perceptual processing.
Just like other face dimensions, age influences the way faces are processed by adults as well as by children. However, it remains unclear under what conditions exactly such influence occurs at both ages, in that there is some mixed evidence concerning the presence of a systematic processing advantage for peer faces (own-age bias) across the lifespan. Inconsistency in the results may stem from the fact that the individual’s face representation adapts to represent the most predominant age traits of the faces present in the environment, which is reflective of the individual’s specific living conditions and social experience. In the current study we investigated the processing of younger and older adult faces in two groups of adults (Experiment 1) and two groups of 3-year-old children (Experiment 2) who accumulated different amounts of experience with elderly people. Contact with elderly adults influenced the extent to which both adult and child participants showed greater discrimination abilities and stronger sensitivity to configural/featural cues in younger versus older adult faces, as measured by the size of the inversion effect. In children, the size of the inversion effect for older adult faces was also significantly correlated with the amount of contact with elderly people. These results show that, in both adults and children, visual experience with older adult faces can tune perceptual processing strategies to the point of abolishing the discrimination disadvantage that participants typically manifest for those faces in comparison to younger adult faces.
Complex behavior typically relies upon many different processes which are related to activity in multiple brain regions. In contrast, neuroimaging analyses typically focus upon isolated processes. Here we present a new approach, combinatorial brain decoding, in which we decode complex behavior by combining the information which we can retrieve from the neural signals about the many different sub-processes. The case in point is visuospatial navigation. We explore the extent to which the route travelled by human subjects (N = 3) in a complex virtual maze can be decoded from activity patterns as measured with functional magnetic resonance imaging. Preliminary analyses suggest that it is difficult to directly decode spatial position from regions known to contain an explicit cognitive map of the environment, such as the hippocampus. Instead, we were able to indirectly derive spatial position from the pattern of activity in visual and motor cortex. The non-spatial representations in these regions reflect processes which are inherent to navigation, such as which stimuli are perceived at which point in time and which motor movement is executed when (e.g., turning left at a crossroad). Highly successful decoding of routes followed through the maze was possible by combining information about multiple aspects of navigation events across time and across multiple cortical regions. This “proof of principle” study highlights how visuospatial navigation is related to the combined activity of multiple brain regions, and establishes combinatorial brain decoding as a means to study complex mental events that involve a dynamic interplay of many cognitive processes.
fMRI; pattern classification; visual cortex; motor cortex; objects and faces
Perceptual expertise has been studied intensively with faces and object categories involving detailed individuation. A common finding is that experience in fulfilling the task demand of fine, subordinate-level discrimination between highly similar instances is associated with the development of holistic processing. This study examines whether holistic processing is also engaged by expert word recognition, which is thought to involve coarser, basic-level processing that is more part-based. We adopted a paradigm widely used for faces, the composite task, and found clear evidence of holistic processing for English words. A second experiment further showed that holistic processing for words was sensitive to the amount of experience with the language concerned (native vs. second-language readers) and with the specific stimuli (words vs. pseudowords). The adoption of a paradigm from the face perception literature to the study of expert word perception is important for further comparison between perceptual expertise with words and face-like expertise.
Previous research has shown that the extent to which people spread attention across the visual field plays a crucial role in visual selection and the occurrence of bottom-up driven attentional capture. Consistent with previous findings, we show that when attention was diffusely distributed across the visual field while searching for a shape singleton, an irrelevant salient color singleton captured attention. However, while using the very same displays and task, no capture was observed when observers initially focused their attention at the center of the display. Using event-related fMRI, we examined the modulation of retinotopic activity related to attentional capture in early visual areas. Because the sensory display characteristics were identical in both conditions, we were able to isolate the brain activity associated with exogenous attentional capture. The results show that spreading of attention leads to increased bottom-up exogenous capture and increased activity in visual area V3 but not in V2 and V1.
The goal of the present study was to examine the extent to which working memory supports the maintenance of object locations during active spatial navigation. Participants were required to navigate a virtual environment and to encode the location of a target object. In the subsequent maintenance period they performed one of three secondary tasks that were designed to selectively load visual, verbal or spatial working memory subsystems. Thereafter participants re-entered the environment and navigated back to the remembered location of the target. We found that while navigation performance in participants with high navigational ability was impaired only by the spatial secondary task, navigation performance in participants with poor navigational ability was impaired equally by spatial and verbal secondary tasks. The visual secondary task had no effect on navigation performance. Our results extend current knowledge by showing that the differential engagement of working memory subsystems is determined by navigational ability.
Object vision in human and nonhuman primates is often cited as a primary example of adult plasticity in neural information processing. It has been hypothesized that visual experience leads to single neurons in the monkey brain with strong selectivity for complex objects, and to regions in the human brain with a preference for particular categories of highly familiar objects. This view suggests that adult visual experience causes dramatic local changes in the response properties of high-level visual cortex. Here, we review the current neurophysiological and neuroimaging evidence and find that the available data support a different conclusion: adult visual experience introduces moderate, relatively distributed effects that modulate a pre-existing, rich and flexible set of neural object representations.
Cultural differences in socialization can lead to characteristic differences in how we perceive the world. Consistent with this influence of differential experience, our perception of faces (e.g., preference, recognition ability) is shaped by our previous experience with different groups of individuals.
Here, we examined whether cultural differences in social practices influence our perception of faces. Japanese, Chinese, and Asian-Canadian young adults made relative age judgments (i.e., which of these two faces is older?) for East Asian faces. Cross-cultural differences in the emphasis on respect for older individuals were reflected in participants' latency in facial age judgments for middle-age adult faces, with the Japanese young adults performing the fastest, followed by the Chinese, then the Asian-Canadians. In addition, consistent with the differential behavioural and linguistic markers used in the Japanese culture when interacting with individuals younger than oneself, only the Japanese young adults showed an advantage in judging the relative age of children's faces.
Our results show that different sociocultural practices shape our efficiency in processing facial age information. The impact of culture may potentially calibrate other aspects of face processing.
Auditory training programs are being developed to remediate various types of communication disorders. Biological changes have been shown to coincide with improved perception following auditory training so there is interest in determining if these changes represent biologic markers of auditory learning. Here we examine the role of stimulus exposure and listening tasks, in the absence of training, on the modulation of evoked brain activity. Twenty adults were divided into two groups and exposed to two similar sounding speech syllables during four electrophysiological recording sessions (24 hours, one week, and up to one year later). In between each session, members of one group were asked to identify each stimulus. Both groups showed enhanced neural activity from session-to-session, in the same P2 latency range previously identified as being responsive to auditory training. The enhancement effect was most pronounced over temporal-occipital scalp regions and largest for the group who participated in the identification task. The effects were rapid and long-lasting with enhanced synchronous activity persisting months after the last auditory experience. Physiological changes did not coincide with perceptual changes so results are interpreted to mean stimulus exposure, with and without being paired with an identification task, alters the way sound is processed in the brain. The cumulative effect likely involves auditory memory; however, in the absence of training, the observed physiological changes are insufficient to result in changes in learned behavior.
The mammalian visual system contains an extensive web of feedback connections projecting from “higher” cortical areas to “lower” areas including primary visual cortex. Although multiple theories have been proposed, the role of these connections in perceptual processing is not understood. Here we report a surprising new phenomenon not predicted by prior theories of feedback: the pattern of fMRI response in human foveal retinotopic cortex contains information about objects presented in the periphery, far away from the fovea. This information is position invariant, correlated with perceptual discrimination accuracy, and found only in foveal, not peripheral, retinotopic cortex. Our data cannot be explained by differential eye movements, activation from the fixation cross, or spillover activation from peripheral retinotopic cortex or from LOC. Instead, our findings indicate that position-invariant object information from higher cortical areas is fed back to foveal retinotopic cortex, enhancing task performance.
A variety of similarities between visual and haptic object recognition suggests that the two modalities may share common representations. However, it is unclear whether such common representations preserve low-level perceptual features or whether transfer between vision and haptics is mediated by high-level, abstract representations. Two experiments used a sequential shape-matching task to examine the effects of size changes on unimodal and crossmodal visual and haptic object recognition. Participants felt or saw 3D plastic models of familiar objects. The two objects presented on a trial were either the same size or different sizes and were the same shape or different but similar shapes. Participants were told to ignore size changes and to match on shape alone. In Experiment 1, size changes on same-shape trials impaired performance similarly for both visual-to-visual and haptic-to-haptic shape matching. In Experiment 2, size changes impaired performance on both visual-to-haptic and haptic-to-visual shape matching and there was no interaction between the cost of size changes and direction of transfer. Together the unimodal and crossmodal matching results suggest that the same, size-specific perceptual representations underlie both visual and haptic object recognition, and indicate that crossmodal memory for objects must be at least partly based on common perceptual representations.
Normal aging significantly influences motor and cognitive performance. Little is known about age-related changes in action simulation. Here, we investigated the influence of aging on implicit motor imagery.
Twenty young (mean age: 23.9±2.8 years) and nineteen elderly (mean age: 78.3±4.5 years) subjects, all right-handed, were required to determine the laterality of hands presented in various positions. To do so, they mentally rotated their hands to match them with the hand-stimuli. We showed that: (1) elderly subjects were affected in their ability to implicitly simulate movements of the upper limbs, especially those requiring the largest amplitude of displacement and/or with strong biomechanical constraints; (2) this decline was greater for movements of the non-dominant arm than of the dominant arm.
These results extend recent findings showing age-related alterations of the explicit side of motor imagery. They suggest that a general decline in action simulation occurs with normal aging, in particular for the non-dominant side of the body.
Neuroimaging research over the past decade has revealed a detailed picture of the functional organization of the human brain. Here we focus on two fundamental questions that are raised by the detailed mapping of sensory and cognitive functions and illustrate these questions with findings from the object-vision pathway. First, are functionally specific regions that are located close together best understood as distinct cortical modules or as parts of a larger-scale cortical map? Second, what functional properties define each cortical map or module? We propose a model in which overlapping continuous maps of simple features give rise to discrete modules that are selective for complex stimuli.
The inferior temporal (IT) cortex in monkeys plays a central role in visual object recognition and learning. Previous studies have observed patches in IT cortex with strong selectivity for highly familiar object classes (e.g., faces), but the principles behind this functional organization are largely unknown due to the many properties that distinguish different object classes. To unconfound shape from meaning and memory, we scanned monkeys with functional magnetic resonance imaging while they viewed classes of initially novel objects. Our data revealed a topography of selectivity for these novel object classes across IT cortex. We found that this selectivity topography was highly reproducible and remarkably stable across a 3-month interval during which monkeys were extensively trained to discriminate among exemplars within one of the object classes. Furthermore, this selectivity topography was largely unaffected by changes in behavioral task and object retinal position, both of which preserve shape. In contrast, it was strongly influenced by changes in object shape. The topography was partially related to, but not explained by, the previously described pattern of face selectivity. Together, these results suggest that IT cortex contains a large-scale map of shape that is largely independent of meaning, familiarity, and behavioral task.
categorization; fMRI; learning; object recognition; primate; visual perception
Faces are arguably one of the most important object categories encountered by human observers, yet they present one of the most difficult challenges to both the human and artificial visual systems. A variety of experimental paradigms have been developed to study how faces are represented and recognized, among which is the part-spacing paradigm. This paradigm is presumed to characterize the processing of both the featural and configural information of faces, and it has become increasingly popular for testing hypotheses on face specificity and in the diagnosis of face perception in cognitive disorders.
In two experiments we questioned the validity of the part task of this paradigm by showing that, in this task, measuring pure information about face parts is confounded by the effect of face configuration on the perception of those parts. First, we eliminated or reduced contributions from face configuration by either rearranging face parts into a non-face configuration or by removing the low spatial frequencies of face images. We found that face parts were no longer sensitive to inversion, suggesting that the previously reported inversion effect observed in the part task was due in fact to the presence of face configuration. Second, self-reported prosopagnosic patients who were selectively impaired in the holistic processing of faces failed to detect part changes when face configurations were presented. When face configurations were scrambled, however, their performance was as good as that of normal controls.
In sum, consistent evidence from testing both normal and prosopagnosic subjects suggests the part task of the part-spacing paradigm is not an appropriate task for either measuring how face parts alone are processed or for providing a valid contrast to the spacing task. Therefore, conclusions from previous studies using the part-spacing paradigm may need re-evaluation with proper paradigms.
The certainty of judgment (or self-confidence) has traditionally been studied in relation to accuracy. However, from an observer's viewpoint, certainty may be more closely related to the consistency of judgment than to its accuracy: consistent judgments are objectively certain in the sense that any external observer can rely on these judgments to happen. The regions of certain vs. uncertain judgment were determined in a categorical rating experiment. The participants rated the size of visual objects on a 5-point scale. There was no feedback so that there were no constraints of accuracy. Individual data were examined, and the ratings were characterized by their frequency distributions (or categories). The main result was that the individual categories always presented a core of certainty where judgment was totally consistent, and large peripheries where judgment was inconsistent. In addition, the geometry of cores and boundaries exhibited several phenomena compatible with the literature on visual categorical judgment. The ubiquitous presence of cores in the absence of accuracy constraints provided insights about objective certainty that may complement the literature on subjective certainty (self-confidence) and the accuracy of judgment.
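The distinction between a core of certainty and an inconsistent periphery can be illustrated with a small sketch. It assumes that consistency is measured as the proportion of repetitions on which the most frequent rating was given, and that a stimulus belongs to the core when that proportion is 1; the function names and the toy ratings are made up for illustration and are not data or definitions from the study.

```python
from collections import Counter

def consistency(ratings):
    """Proportion of repetitions on which the most frequent rating was given."""
    counts = Counter(ratings)
    return counts.most_common(1)[0][1] / len(ratings)

def certainty_core(ratings_by_size):
    """Stimulus sizes whose rating never varied across repetitions."""
    return sorted(
        size for size, ratings in ratings_by_size.items()
        if consistency(ratings) == 1.0
    )

# Made-up ratings on a 5-point scale: five stimulus sizes, four repetitions each.
toy_ratings = {
    10: [1, 1, 1, 1],
    20: [1, 2, 1, 2],
    30: [3, 3, 3, 3],
    40: [4, 3, 4, 5],
    50: [5, 5, 5, 5],
}

print(certainty_core(toy_ratings))  # [10, 30, 50]
```

In this toy example, sizes 20 and 40 fall in the periphery because their ratings vary from one repetition to the next.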
Neural mechanisms underlying invariant behaviour such as object recognition are not well understood. For brain regions critical for object recognition, such as inferior temporal cortex (ITC), there is now ample evidence indicating that single cells code for many stimulus aspects, implying that only a moderate degree of invariance is present. However, recent theoretical and empirical work seems to suggest that integrating responses of multiple non-invariant units may produce invariant representations at population level. We provide an explicit test for the hypothesis that a linear read-out mechanism of a pool of units resembling ITC neurons may achieve invariant performance in an identification task. A linear classifier was trained to decode a particular value in a 2-D stimulus space using as input the response pattern across a population of units. Only one dimension was relevant for the task, and the stimulus location on the irrelevant dimension (ID) was kept constant during training. In a series of identification tests, the stimulus location on the relevant dimension (RD) and ID was manipulated, yielding estimates for both the level of sensitivity and tolerance reached by the network. We studied the effects of several single-cell characteristics as well as population characteristics typically considered in the literature, but found little support for the hypothesis. While the classifier averages out effects of idiosyncratic tuning properties and inter-unit variability, its invariance is very much determined by the (hypothetical) ‘average’ neuron. Consequently, even at population level there exists a fundamental trade-off between selectivity and tolerance, and invariant behaviour does not emerge spontaneously.
object recognition; inferior temporal cortex; population coding; multidimensional tuning
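The read-out test described in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' model: units have separable Gaussian tuning on a regular grid over the relevant dimension (RD) and the irrelevant dimension (ID), the linear read-out is a simple difference-of-means classifier, and the tuning width, grid, and stimulus values are all hypothetical.

```python
import math

SIGMA = 0.15                            # hypothetical tuning width
CENTERS = [i / 10 for i in range(11)]   # hypothetical preferred values per dimension

def population_response(rd, irr):
    """Responses of units with 2-D Gaussian tuning over (RD, ID)."""
    return [
        math.exp(-((rd - c_rd) ** 2 + (irr - c_id) ** 2) / (2 * SIGMA ** 2))
        for c_rd in CENTERS
        for c_id in CENTERS
    ]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_readout(target_rd, other_rds, train_id):
    """Difference-of-means linear read-out, trained at a single ID value."""
    target = population_response(target_rd, train_id)
    others = [population_response(rd, train_id) for rd in other_rds]
    mean_other = [sum(col) / len(others) for col in zip(*others)]
    weights = [t - m for t, m in zip(target, mean_other)]
    midpoint = [(t + m) / 2 for t, m in zip(target, mean_other)]
    bias = -dot(weights, midpoint)
    return weights, bias

def decision(weights, bias, rd, irr):
    """Positive values mean 'target RD value present'."""
    return dot(weights, population_response(rd, irr)) + bias

weights, bias = train_readout(0.5, [0.2, 0.3, 0.7, 0.8], train_id=0.5)

# Probe the target RD value at the trained ID and at two shifted ID values.
for test_id in (0.5, 0.6, 0.9):
    print(test_id, round(decision(weights, bias, 0.5, test_id), 2))
```

With these assumed parameters the target is accepted at the trained ID value, and the decision value shrinks as the ID moves away from it, falling below zero for the largest shift. The tolerance of the read-out is therefore bounded by the tuning width of the units, which is one simple way to picture the trade-off between selectivity and tolerance.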
Visual input from the left and right visual fields is processed predominantly in the contralateral hemisphere. Here we investigated whether this preference for contralateral over ipsilateral stimuli is also found in high-level visual areas that are important for the recognition of objects and faces. Human subjects were scanned with functional magnetic resonance imaging (fMRI) while they viewed and attended faces, objects, scenes, and scrambled images in the left or right visual field. With our stimulation protocol, primary visual cortex responded only to contralateral stimuli. The contralateral preference was smaller in object- and face-selective regions, and it was smallest in the fusiform gyrus. Nevertheless, each region showed a significant preference for contralateral stimuli. These results indicate that sensitivity to stimulus position is present even in high-level ventral visual cortex.
A classification image (CI) technique has shown that static luminance noise near visually completed contours affects the discrimination of fat and thin Kanizsa shapes. These influential noise regions were proposed to reveal “behavioral receptive fields” of completed contours–the same regions to which early cortical cells respond in neurophysiological studies of contour completion. Here, we hypothesized that 1) influential noise regions correspond to the surfaces that distinguish fat and thin shapes (hereafter, key regions); and 2) key region noise biases a “fat” response to the extent that its contrast polarity (lighter or darker than background) matches the shape's filled-in surface color.
To test our hypothesis, we had observers discriminate fat and thin noise-embedded rectangles that were defined by either illusory or luminance-defined contours (Experiment 1). Surrounding elements (“inducers”) caused the shapes to appear either lighter or darker than the background–a process sometimes referred to as lightness induction. For both illusory and luminance-defined rectangles, key region noise biased a fat response to the extent that its contrast polarity (light or dark) matched the induced surface color. When lightness induction was minimized, luminance noise had no consistent influence on shape discrimination. This pattern arose when pixels immediately adjacent to the discriminated boundaries were excluded from the analysis (Experiment 2) and also when the noise was restricted to the key regions so that the noise never overlapped with the physically visible edges (Experiment 3). The lightness effects did not occur in the absence of enclosing boundaries (Experiment 4).
Under noisy conditions, lightness induction alters visually completed shape. Moreover, behavioral receptive fields derived in CI studies do not correspond to contours per se but to filled-in surface regions contained by those contours. The relevance of lightness to two-dimensional shape completion supplies a new constraint for models of object perception.
Within the range of images that we might categorize as a “beach”, for example, some will be more representative of that category than others. Here we first confirmed that humans could categorize “good” exemplars better than “bad” exemplars of six scene categories and then explored whether brain regions previously implicated in natural scene categorization showed a similar sensitivity to how well an image exemplifies a category. In a behavioral experiment participants were more accurate and faster at categorizing good than bad exemplars of natural scenes. In an fMRI experiment participants passively viewed blocks of good or bad exemplars from the same six categories. A multi-voxel pattern classifier trained to discriminate among category blocks showed higher decoding accuracy for good than bad exemplars in the PPA, RSC and V1. This difference in decoding accuracy cannot be explained by differences in overall BOLD signal, as average BOLD activity was either equivalent or higher for bad than good scenes in these areas. These results provide further evidence that V1, RSC and the PPA not only contain information relevant for natural scene categorization, but their activity patterns mirror the fundamentally graded nature of human categories. Analysis of the image statistics of our good and bad exemplars shows that variability in low-level features and image structure is higher among bad than good exemplars. A simulation of our neuroimaging experiment suggests that such a difference in variance could account for the observed differences in decoding accuracy. These results are consistent with both low-level models of scene categorization and models that build categories around a prototype.
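The claim that a difference in variance alone could produce a difference in decoding accuracy can be illustrated with a small simulation. The sketch below is not the authors' simulation: it assumes six category prototypes in a hypothetical 20-voxel space, Gaussian noise around each prototype, and a nearest-centroid classifier trained and tested separately for each exemplar type. All parameter values are arbitrary.

```python
import random

def simulate_decoding(noise_sd, n_categories=6, n_voxels=20,
                      n_train=20, n_test=20, seed=0):
    """Nearest-centroid decoding accuracy for patterns with a given noise level."""
    rng = random.Random(seed)
    prototypes = [[rng.gauss(0, 1) for _ in range(n_voxels)]
                  for _ in range(n_categories)]

    def sample(cat):
        # One simulated activity pattern: category prototype plus noise.
        return [p + rng.gauss(0, noise_sd) for p in prototypes[cat]]

    # Estimate one centroid per category from the training patterns.
    centroids = []
    for cat in range(n_categories):
        train = [sample(cat) for _ in range(n_train)]
        centroids.append([sum(col) / n_train for col in zip(*train)])

    def classify(pattern):
        dists = [sum((a - b) ** 2 for a, b in zip(pattern, c)) for c in centroids]
        return dists.index(min(dists))

    correct = sum(classify(sample(cat)) == cat
                  for cat in range(n_categories)
                  for _ in range(n_test))
    return correct / (n_categories * n_test)

acc_good = simulate_decoding(noise_sd=0.5)  # low variability ("good" exemplars)
acc_bad = simulate_decoding(noise_sd=3.0)   # high variability ("bad" exemplars)
print(acc_good, acc_bad)
```

Because both runs share the same prototypes, the only difference between them is the spread of patterns around each prototype, so any gap in accuracy is attributable to variance rather than to a difference in mean signal.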
Visual experience plays an important role in the development of the visual cortex; however, recent functional imaging studies have shown that the functional organization is preserved in several higher-tier visual areas in congenitally blind subjects, indicating that maturation of visual areas depends unequally on visual experience. In this study, we aim to validate this hypothesis using a multimodality MRI approach. We found that increased cortical thickness in the congenitally blind was present in the early visual areas and absent in the higher-tier ones, suggesting that the structural development of the visual cortex depends hierarchically on visual experience. In congenitally blind subjects, the decreased resting-state functional connectivity with the primary somatosensory cortex was more prominent in the early visual areas than in the higher-tier ones and was more pronounced in the ventral stream than in the dorsal one, suggesting that the development of functional organization of the visual cortex also depends differently on visual experience. Moreover, congenitally blind subjects showed normal or increased functional connectivity between ipsilateral higher-tier and early visual areas, suggesting an indirect corticocortical pathway through which somatosensory information can reach the early visual areas. These findings support our hypothesis that the development of visual areas depends differently on visual experience.
Prior studies have shown that spatial attention modulates early visual cortex retinotopically, resulting in enhanced processing of external perceptual representations. However, it is not clear whether the same visual areas are modulated when attention is focused on, and shifted within, a working memory representation. In the current fMRI study participants were asked to memorize an array containing four stimuli. After a delay, participants were presented with a verbal cue instructing them to actively maintain the location of one of the stimuli in working memory. Additionally, on a number of trials a second verbal cue instructed participants to switch attention to the location of another stimulus within the memorized representation. Results of the study showed that changes in the BOLD pattern closely followed the locus of attention within the working memory representation. A decrease in BOLD activity (V1–V3) was observed at ROIs coding a memory location when participants switched away from this location, whereas an increase was observed when participants switched towards this location. Continuously increased activity was obtained at the memorized location when participants did not switch. This study shows that shifting attention within memory representations activates the earliest parts of visual cortex (including V1) in a retinotopic fashion. We conclude that even in the absence of visual stimulation, early visual areas support shifting of attention within memorized representations, similar to when attention is shifted in the outside world. The relationship between visual working memory and visual mental imagery is discussed in light of the current findings.
There are no known biological measures that accurately predict future development of psychiatric disorders in individual at-risk adolescents. We investigated whether machine learning and fMRI could help to: 1. differentiate healthy adolescents genetically at-risk for bipolar disorder and other Axis I psychiatric disorders from healthy adolescents at low risk of developing these disorders; 2. identify those healthy genetically at-risk adolescents who were most likely to develop future Axis I disorders.
16 healthy offspring genetically at risk for bipolar disorder and other Axis I disorders by virtue of having a parent with bipolar disorder and 16 healthy, age- and gender-matched low-risk offspring of healthy parents with no history of psychiatric disorders (12–17 year-olds) performed two emotional face gender-labeling tasks (happy/neutral; fearful/neutral) during fMRI. We used Gaussian Process Classifiers (GPC), a machine learning approach that assigns a predictive probability of group membership to an individual person, to differentiate groups and to identify those at-risk adolescents most likely to develop future Axis I disorders.
Using GPC, activity to neutral faces presented during the happy experiment accurately and significantly differentiated groups, achieving 75% accuracy (sensitivity = 75%, specificity = 75%). Furthermore, predictive probabilities were significantly higher for those at-risk adolescents who subsequently developed an Axis I disorder than for those at-risk adolescents remaining healthy at follow-up.
We show that a combination of two promising techniques, machine learning and neuroimaging, not only discriminates healthy low-risk from healthy adolescents genetically at-risk for Axis I disorders, but may ultimately help to predict which at-risk adolescents subsequently develop these disorders.
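The accuracy, sensitivity, and specificity reported above are mutually consistent for two groups of 16. A minimal sketch of that arithmetic follows, assuming the at-risk group is treated as the positive class; the counts of 12 correct and 4 incorrect per group are inferred from the 75% figures and the group size, and are not stated in the abstract.

```python
def classification_summary(true_pos, false_neg, true_neg, false_pos):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    total = true_pos + false_neg + true_neg + false_pos
    accuracy = (true_pos + true_neg) / total
    return accuracy, sensitivity, specificity

# Assumed counts: 12 of 16 at-risk and 12 of 16 low-risk adolescents
# classified correctly ("positive" = at-risk is an assumption).
accuracy, sensitivity, specificity = classification_summary(
    true_pos=12, false_neg=4, true_neg=12, false_pos=4
)
print(accuracy, sensitivity, specificity)  # 0.75 0.75 0.75
```

With equal group sizes, accuracy is simply the mean of sensitivity and specificity, which is why all three values coincide here.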
Rapid identification of facial expressions can profoundly affect social interactions, yet most research to date has focused on static rather than dynamic expressions. In four experiments, we show that when a non-expressive face becomes expressive, happiness is detected more rapidly than anger. When the change occurs peripheral to the focus of attention, however, dynamic anger is better detected when it appears in the left visual field (LVF), whereas dynamic happiness is better detected in the right visual field (RVF), consistent with hemispheric differences in the processing of approach- and avoidance-relevant stimuli. The central advantage for happiness is nevertheless the more robust effect, persisting even when information of either high or low spatial frequency is eliminated. Indeed, a survey of past research on the visual search for emotional expressions finds better support for a happiness detection advantage, and the explanation may lie in the coevolution of the signal and the receiver.