Models of visual emotional perception suggest a reentrant organization of the ventral visual system with the amygdala. Using focused functional magnetic resonance imaging in humans with a sampling rate of 100 ms, here we determine the relative timing of emotional discrimination in amygdala and ventral visual cortical structures during emotional perception. Results show that amygdala and inferotemporal visual cortex differentiate emotional from nonemotional scenes approximately 1 s prior to extrastriate occipital cortex, while primary occipital cortex shows consistent activity across all scenes. This pattern of discrimination is consistent with a reentrant organization of emotional perception in visual processing, where transaction between rostral ventral visual cortex and amygdala originates the identification of emotional relevance.
Visual perception of emotionally arousing, relative to nonemotional, stimuli is associated with greater blood oxygen level dependent (BOLD) signal across widespread regions of the ventral visual system, including inferotemporal (IT) and extrastriate occipital cortex (Bradley et al., 2003; Britton et al., 2006; Norris et al., 2004; Sabatinelli et al., 2005, 2007). While considerable evidence suggests a relationship between activity in amygdala and inferotemporal cortex during emotional perception (Armony & Dolan, 2002; Morris et al., 1998; Sabatinelli et al., 2005; Vuilleumier et al., 2004), the means by which the ventral visual system differentiates emotional from nonemotional scenes is not well defined.
Visual emotional discrimination in non-human primates is hypothesized to result from reentrant feedback from the amygdala to ventral visual cortex (Amaral & Price, 1984; Iwai & Yukie, 1987; Spiegler & Mishkin, 1981). Specifically, Amaral and colleagues identify dense amygdala innervation into rostral inferotemporal cortex, with more sparse innervation in caudal occipital areas (Amaral, 2003; Amaral et al., 1992; Freese & Amaral, 2005). This dense interconnectivity, the high correlation between amygdala and inferotemporal activity (Armony & Dolan, 2002; Morris et al., 1998; Sabatinelli et al., 2005; Vuilleumier et al., 2004) and the role of the inferotemporal cortex in high-level visual perception (Chao et al., 1999; DeYoe & Van Essen, 1988; Grill-Spector & Malach, 2004) suggest that the process of emotional discrimination in human visual perception may originate during the interaction between amygdala and rostral inferotemporal cortex, and develop later in caudal extrastriate cortex. If emotional discrimination in amygdala and inferotemporal cortex precedes emotional discrimination in occipital cortex, a reentrant model of emotional perception would be supported. If no difference in the timing of emotional discrimination is evident across extrastriate occipital cortex, IT cortex, and amygdala, the reentrant model may be insufficiently described, the recording methodology used here too insensitive, or alternative models relevant to emotional perception exclusive of amygdala feedback may be more appropriate (e.g., Posner & Peterson, 1990; Heller, 1990).
It is possible to discriminate cortical sources in ventral visual cortex with surface-based electroencephalography (Keil et al., 2009; Sabatinelli et al., 2007), yet spatial resolution is considerably reduced in rostral inferior temporal areas, where the distance from the scalp is greatest (Russel et al., 1998; Schloffelen & Gross, 2009). Conversely, differentiating the timing of activation in visual cortex is beyond the temporal resolution of whole-brain fMRI acquisition techniques. However, while BOLD contrast is inherently delayed relative to neural activity, the timing of signal change within active clusters is highly reliable (Kim et al., 1997; Menon & Kim, 1999; Miezin et al., 2000). By comparing the time course of BOLD signal within regions of interest across experimental conditions, the effective temporal resolution is limited only by the sampling rate at which BOLD signal can be recorded, and the reliability of the signal. As we are not concerned with the relative timing of BOLD activation across regions, potential confounds regarding the variations in vascular anatomy, as well as individual differences in BOLD timing (Aguirre et al., 1998; Buxton et al., 1998), are avoided.
Here we record BOLD signal in amygdala, inferotemporal, extrastriate and striate occipital cortex in a single slice, as rapidly as signal quality will allow, while participants view an event-related series of emotionally arousing and neutral pictures. If emotional discrimination in visual perception originates via feedback between amygdala and rostral inferotemporal cortex, emotional discrimination in these regions (differential BOLD signal across arousing and neutral picture conditions) should occur earlier than in caudal occipital cortex.
Twenty undergraduate volunteers participated for course credit or $20 USD compensation. All volunteers consented to participate after reading a description of the study, approved by the local human subjects review board. Prior to entering the bore of the Siemens 3T Allegra MR scanner, participants were fitted with earplugs and given a patient-alarm squeezeball. A vacuum pillow, padding and explicit verbal instruction were used to limit head motion. Two participants’ data were excluded due to excessive head motion, and 1 was lost due to scanner malfunction. The final sample included 10 males and 7 females (average age 18.8 years, range 18–21).
Participants were asked to maintain fixation on a dot at the center of a 7″ LCD screen mounted directly behind the head, visible via a coil-mounted mirror (IFIS MR-compatible hardware, Intermagnetics, Latham, NY). After 3 acclimation trials in which checkerboard stimuli were presented, a series of 24 picture stimuli were presented (25° visual angle) in an event-related design. The picture stimuli were chosen from the International Affective Picture System (IAPS; Lang et al., http://csea.phhp.ufl.edu/Media.html) and all depicted people, including 8 exemplars each of highly arousing erotic couples (pleasant- 4611, 4658, 4659, 4669, 4676, 4680, 4690, and 4694), moderately arousing neutral people (neutral- 2037, 2102, 2305, 2383, 2393, 2396, 2513, and 2595), and highly arousing mutilations (unpleasant- 3000, 3030, 3060, 3068, 3069, 3100, 3102, and 3225). The pleasant and unpleasant pictures were selected to be equivalent in normative ratings of emotional arousal (see Table 1). All picture stimuli were converted to grayscale and matched for luminance and 90% quality JPEG file size by category using Adobe Photoshop® 7. Each picture was presented for 3 s, followed by a 9 s fixation-only period. Picture order was pseudo-randomized, allowing no more than 2 successive presentations of a stimulus category. The picture series was repeated (in unique orders) in 3 additional blocks, for a total of 96 trials over ~23 minutes.
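The ordering constraint described above (no more than 2 successive presentations of a stimulus category) can be illustrated with a minimal sketch. The rejection-sampling approach, the function names, and the seed are assumptions made for illustration; the source does not state how the orders were actually generated.

```python
import random

def longest_run(seq):
    """Length of the longest stretch of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that equals no real label
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best

def pseudo_random_order(categories, per_category, max_run=2, seed=None):
    """Reshuffle the trial labels until no category repeats more than max_run times in a row."""
    rng = random.Random(seed)
    trials = [c for c in categories for _ in range(per_category)]
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)

# One block of 24 trials: 8 exemplars for each of the 3 picture categories.
order = pseudo_random_order(["pleasant", "neutral", "unpleasant"], per_category=8, seed=1)
```

Calling the function with a different seed for each block would give the four unique orders used across the 96 trials.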
Once participants were comfortable inside the bore, an 8 min T1-weighted 3D structural volume was collected. The prescription specified 160 sagittal slices, with 1 mm isotropic voxels in a 256 mm field of view. In addition, a single T1-weighted slice was acquired at the location of the single functional slice acquisition, described below. Following structural acquisitions, the 5 mm slice prescription (Gradient echo, echoplanar 64×64, 180 mm FOV, 25° flip angle, 30 ms TE, 100 ms TR) was oriented in an oblique axial plane such that sampling of amygdala, inferotemporal cortex, and middle occipital gyrus could be obtained. The placement was tailored for each participant, originating with coverage of amygdala, and tilted for optimum coverage of visual areas of interest, and if possible, to exclude sinus cavity coverage, as a means of limiting susceptibility artifact. This led to substantial sampling of the calcarine fissure. The single-slice prescription allows 40 µl voxels to be repetitively sampled at a temporal resolution of 100 ms.
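The voxel size and sampling rate implied by this prescription follow from simple arithmetic, sketched here. The variable names are illustrative; the input values are those given in the prescription, and the only added fact is that 1 mm³ equals 1 µl.

```python
fov_mm = 180.0    # in-plane field of view
matrix = 64       # 64 x 64 acquisition matrix
slice_mm = 5.0    # slice thickness
tr_s = 0.100      # repetition time, 100 ms

in_plane_mm = fov_mm / matrix             # edge length of one voxel in the slice plane
voxel_ul = in_plane_mm ** 2 * slice_mm    # volume in mm^3, which equals microlitres
roi_ul = 9 * voxel_ul                     # a nine-voxel region of interest
sampling_hz = 1.0 / tr_s                  # one image every 100 ms

print(f"{in_plane_mm:.4f} mm in-plane, {voxel_ul:.2f} ul per voxel, "
      f"{roi_ul:.0f} ul per 9-voxel ROI, {sampling_hz:.0f} Hz")
```

The in-plane voxel edge of 2.8125 mm gives a volume of roughly 40 µl per voxel, and a two-voxel extent corresponds to 5.625 mm.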
Each participant’s 96-trial functional time series was linearly detrended, temporally smoothed with a 1 s Gaussian filter, and spatially smoothed across 2 voxels (5.625 mm FWHM) using BrainVoyager QX 1.8 (Brain Innovation, Maastricht, The Netherlands). Temporal smoothing was necessary to reduce the effects of physiological noise present in the time series at a 10 Hz sampling rate. Trials with residual head motion were removed manually, by identifying large (greater than 4 times the background variation) and brief spikes located by examining the average time series intensity across a majority of the voxels in the slice (a rectangular region of greater than half the voxels within the brain). This procedure resulted in the removal of less than 2% of total trials, and no more than 4 trials from any subject. Average signal-to-noise ratios [(signal – noise)/sd noise] calculated from the image data were 80.5 in calcarine fissure, 81.0 in middle occipital gyrus, 97.8 in inferotemporal cortex, and 79.4 in amygdala.
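The two temporal preprocessing steps, linear detrending and Gaussian smoothing, can be sketched as follows. This is not BrainVoyager's implementation. It assumes that the 1 s filter width refers to the full width at half maximum, truncates the kernel at three standard deviations, and renormalises the kernel at the edges of the series; all function names are hypothetical.

```python
import math

def linear_detrend(y):
    """Subtract the least-squares straight line (including its mean) from a series."""
    n = len(y)
    t_mean = (n - 1) / 2.0
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    slope = sxy / sxx
    return [v - (y_mean + slope * (t - t_mean)) for t, v in enumerate(y)]

def gaussian_kernel(fwhm_s, dt_s):
    """Normalised Gaussian weights; FWHM in seconds (assumed), sample spacing dt_s."""
    sigma = fwhm_s / (2.0 * math.sqrt(2.0 * math.log(2.0))) / dt_s  # in samples
    half = int(math.ceil(3 * sigma))  # truncate at 3 standard deviations
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(y, kernel):
    """Convolve with a symmetric kernel, renormalising where it overhangs the ends."""
    half = len(kernel) // 2
    out = []
    for i in range(len(y)):
        acc = wsum = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(y):
                acc += w * y[j]
                wsum += w
        out.append(acc / wsum)
    return out

# Synthetic 12 s trial sampled every 100 ms: slow drift plus a fast oscillation.
raw = [100.0 + 0.02 * t + 0.5 * math.sin(2 * math.pi * 3.0 * t * 0.1) for t in range(120)]
kernel = gaussian_kernel(fwhm_s=1.0, dt_s=0.1)
clean = smooth(linear_detrend(raw), kernel)
```

At a 10 Hz sampling rate, a 1 s FWHM kernel spans about 27 samples under these assumptions, so fluctuations much faster than 1 Hz are strongly attenuated while the slower hemodynamic response is largely preserved.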
The processed image series were entered into single subject analyses of variance, identifying BOLD signal change evoked by the 3 picture contents (erotica, neutral people, and mutilations), using a standard two-gamma hemodynamic response function (Boynton et al., 1996). A false discovery rate (Genovese et al., 2002) of p < .01 was used to threshold each participant’s data. From these functional maps, 4 regions were sampled including bilateral amygdala, inferotemporal cortex, middle occipital gyrus, and midline calcarine fissure. Each region was sampled across 9 voxels (356 µl) using axial neuroanatomical atlases (Haines, 1995; Talairach & Tournoux, 1988) as guides (see Figure 1). The variability of placement across subjects for all ROIs was minimized as much as possible, balancing the need for consistency across subjects with sensitivity to the central location of significant clusters within subject.
BOLD signal at the peak of the response (4–8 s after picture onset) was greater during pleasant and unpleasant, relative to neutral, picture presentations in middle occipital gyrus (Content F(2,15) = 49.93, p < .001; quadratic F(1,16) = 64.34, p < .001), inferotemporal cortex (Content F(2,15) = 26.67, p < .001; quadratic F(1,16) = 51.62, p < .001), and amygdala (Content F(2,15) = 6.02, p < .05; quadratic F(1,16) = 12.72, p < .01). Pleasant and unpleasant pictures led to equivalent BOLD signal increase across the three regions (no linear effects approached significance). No effects of picture content were found in calcarine fissure.
To reliably identify the point at which emotion-specific BOLD signal increases occurred in amygdala, inferotemporal cortex, and middle occipital gyrus, non-parametric permutation tests (Maris, 2004; Maris & Oostenveld, 2007) were computed for each time point and region in the first 5 seconds (50 time points) of picture presentation. Labels encoding picture arousal (pleasant and unpleasant pictures led to equivalent responses) were randomly reassigned in 10,000 draws, and checked for independence from previous permutation orders. A repeated measures F-statistic was then generated for each time point, and a Gaussian function fit to the distribution. The threshold value of the F-statistic derived from the permutation distribution was computed as the 99.9th percentile of the distribution described by this fitted Gaussian (p < .01). The times after picture onset at which this threshold was met were 3.9 s for middle occipital gyrus, 2.5 s for inferotemporal cortex, and 2.9 s for amygdala (see arrows on the abscissa in Figure 2).
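The logic of this onset analysis can be sketched for a single region under several simplifying assumptions that are not taken from the source. The sketch uses only two conditions (arousing versus neutral), so the repeated measures F equals the squared paired t statistic. Label reassignment is implemented as a random sign flip of each participant's condition difference, and the check for duplicate permutation orders is omitted. The function names and the synthetic data are hypothetical.

```python
import random
from statistics import NormalDist

def paired_f(diffs):
    """Two-level repeated-measures F, i.e. the squared paired t statistic."""
    n = len(diffs)
    m = sum(diffs) / n
    var = sum((d - m) ** 2 for d in diffs) / (n - 1)
    return n * m * m / var

def permutation_threshold(diffs, n_draws, percentile, rng):
    """Percentile of a Gaussian fitted to the null F values from random sign flips."""
    null = []
    for _ in range(n_draws):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        null.append(paired_f(flipped))
    return NormalDist.from_samples(null).inv_cdf(percentile)

def discrimination_onset(diff_by_time, n_draws=10000, percentile=0.999, seed=0):
    """Index of the first time point whose observed F exceeds its threshold, else None."""
    rng = random.Random(seed)
    for t, diffs in enumerate(diff_by_time):
        if paired_f(diffs) > permutation_threshold(diffs, n_draws, percentile, rng):
            return t
    return None

# Synthetic arousing-minus-neutral differences for 17 participants and 50 time
# points (100 ms apart): no effect for the first 25 samples, a constant offset after.
noise = [(i - 8) / 8 for i in range(17)]
diff_by_time = [noise] * 25 + [[d + 2.0 for d in noise]] * 25
onset = discrimination_onset(diff_by_time, n_draws=500)  # fewer draws, for speed
onset_s = None if onset is None else onset * 0.1
```

Because the threshold is recomputed from the permuted data at each time point, it adapts to the variability at that point rather than relying on a single parametric critical value.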
A second test including structure (inferotemporal and middle occipital ROIs) and emotional discrimination (arousing and nonarousing pictures) as factors yielded an interaction of ROI and arousal that was reliable (p <.05) from 3.4 s to 3.9 s after picture onset, indicating a significantly earlier discrimination of emotional arousal in inferotemporal regions relative to middle occipital gyrus.
These data show that emotion-related increases in BOLD signal are reliable in amygdala and inferotemporal visual cortex approximately 1 s prior to extrastriate occipital cortex. As the sequence of basic visual processing stages places extrastriate cortex (V2) ahead of inferotemporal areas (Desimone & Ungerleider, 1989; DeYoe & Van Essen, 1988), evidence of later discrimination supports the perspective (Amaral et al., 1992; Lang et al., 1997; Shi & Davis, 2001; Vuilleumier, 2005) that emotional significance is identified by some means in the amygdala, through which feedback to rostral IT, and eventually caudal occipital areas, leads to enhanced perceptual processing, and ‘motivated attention’.
The integration of emotional significance into visual perception as a result of amygdala – rostral – caudal recurrent processing fits well with conceptions of complex scene processing as an iterative, non-hierarchical mechanism (Grill-Spector & Malach, 2004; Hegde & Felleman, 2007; Lamme & Roelfsema, 2000). In the primate, the initial inferotemporal cortical response to a picture of a conspecific is thought to reflect global categorization of the percept, and is followed by a more sustained response that is associated with detail factors such as identity and facial expression (Nakamura et al., 1994; Nishijo et al., 2007; Sugase et al., 1999). The timing of this later stage of detail processing is consistent with estimates of categorization latency in human research (Codispoti et al., 2006; Junghofer et al., 2001; Tsuchiya et al., 2008; VanRullen & Thorpe, 2001). Two intracranial studies have shown human amygdala differentiation of aversive from neutral pictures (Oya et al., 2002) and facial expressions (Krolak-Salmon et al., 2004) beginning 150–200 ms after stimulus onset. We speculate that in the current dataset, it is this later processing stage which may underlie the increased signal present in amygdala and IT cortex, necessarily delayed and smoothed through hemodynamic BOLD contrast.
The statistical thresholds resulting from the permutation resampling procedure enabled us to identify the time point at which BOLD signal in our regions of interest during emotionally arousing trials differed from nonarousing trials. This analysis shows that the amygdala and inferotemporal cortex discriminated picture emotionality prior to middle occipital gyrus. However, the inferotemporal cortex shows statistically reliable differentiation prior to amygdala (2.5 s vs. 2.9 s after picture onset, see Figure 2). Yet a 2-factor test of amygdala and inferotemporal arousal discrimination yields no interaction, and thus the difference in discrimination onset is not reliable. Perhaps more importantly, a demonstration that the inferotemporal cortex does not differentiate emotional stimuli in the absence of amygdala input (Vuilleumier et al., 2004) suggests that the discrimination originates in the amygdala, and is brought about in inferotemporal areas soon thereafter.
Other means to assess the timing of BOLD signal can approximate high temporal resolution of neural activity, such as image acquisition jitter and modelling of undersampled responses (Alpert et al., 2007; Duff et al., 2007; Lee et al., 2005). In a study of fear-relevant picture processing (Larson et al., 2006), amygdala activity was recorded in spider phobics and controls using a jittered image acquisition, with an effective sampling rate of 300 ms. Modelling of the BOLD signal response yielded a reduced latency of more than 1 second in phobics relative to controls in response to spider pictures, yet no difference in signal amplitude. Modelling of BOLD signal is undoubtedly a powerful means of assessing the time course of BOLD signal, yet it is dependent on the accurate specification of many underlying factors. Here we intended to exploit the capability of data collection to the fullest extent, and thus reduce the chance of mischaracterization of the data.
While the timing of emotional discrimination in amygdala and inferotemporal cortex relative to occipital cortex is consistent with a reentrant model of emotional perception, the support is inferential. Beyond the simple timing of emotion differentiation across structures, predictive time-series techniques such as Granger causality analyses could provide support for the superordinate role of amygdala differentiation in emotional perception. In the framework of Granger causality, one measured process is said to be causal to a second if the predictability of the second process at a given time point is improved by including measures from the history of the first process. The current dataset did not lend itself to such analyses, as the brief picture periods and event-related design precluded stationary neural processing states of sufficient duration. Future work using fast-sampled fMRI and experimental designs allowing extended, stable periods of emotional processing may be more suitable.
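The Granger definition given above can be made concrete with a minimal lag-1 sketch on synthetic data. This illustrates the concept only; it is not an analysis of the present dataset, which was judged unsuitable. The sketch assumes stationary series, a single lag, and no intercept after demeaning, and it reports the log ratio of residual sums of squares rather than a formal significance test. All names and the simulated coupling values are hypothetical.

```python
import math
import random

def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def granger_lag1(x, y):
    """ln(RSS_restricted / RSS_full) for predicting y[t] from y[t-1], with and without x[t-1]."""
    x, y = demean(x), demean(y)
    target, y_past, x_past = y[1:], y[:-1], x[:-1]
    syy = sum(a * a for a in y_past)
    sxx = sum(a * a for a in x_past)
    sxy = sum(a * b for a, b in zip(x_past, y_past))
    sty = sum(t * a for t, a in zip(target, y_past))
    stx = sum(t * a for t, a in zip(target, x_past))
    # Restricted model: y[t] = a * y[t-1] + error
    a_r = sty / syy
    rss_r = sum((t - a_r * p) ** 2 for t, p in zip(target, y_past))
    # Full model: y[t] = a * y[t-1] + b * x[t-1] + error (2 x 2 normal equations)
    det = syy * sxx - sxy * sxy
    a_f = (sty * sxx - stx * sxy) / det
    b_f = (stx * syy - sty * sxy) / det
    rss_f = sum((t - a_f * p - b_f * q) ** 2 for t, p, q in zip(target, y_past, x_past))
    return math.log(rss_r / rss_f)

# Synthetic example in which the history of x drives y, but not the reverse.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
y = [0.0]
for t in range(1, 2000):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + 0.5 * rng.gauss(0.0, 1.0))

x_to_y = granger_lag1(x, y)  # clearly positive: x's history improves prediction of y
y_to_x = granger_lag1(y, x)  # near zero: y's history adds almost nothing for x
```

A value near zero means that the history of the first series adds nothing to the prediction of the second, whereas larger values indicate improved predictability. Practical analyses would typically use higher model orders and test the improvement statistically.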
The early difference (1–3 s) in amygdala signal across picture contents can be attributed to both an early increase during arousing pictures, and a transient decrease at the onset of neutral pictures. This decrease in amygdala signal in response to nonarousing conditions has been reported in several fMRI studies of emotional processing (Armony & Dolan, 2002; Morris et al., 2002; Wright et al., 2001); however, the mechanism is as yet unknown. Future work explicitly controlling the predictability of stimulus conditions and timing may shed light on this effect.
A tradeoff of focused image acquisition is reduced coverage. This study was intended to test the hypothesis that specific regions of the visual system, which were capable of being sampled in a single plane with the amygdala, differ from one another in the timing of BOLD change across conditions. Of course there may be other areas of the brain that show emotional differentiation at earlier or later points, and these possibilities can be addressed in additional studies.
In summary, these data show in a human sample the relative timing of emotional discrimination across amygdala and ventral visual cortical structures during emotional perception. In this analysis, amygdala and inferotemporal cortex differentiate emotional from nonemotional scenes approximately 1 s prior to secondary occipital cortex, while primary occipital cortex shows consistent activity across all scenes. This pattern of discrimination is consistent with a reentrant organization of emotional perception in visual processing, where transaction between rostral ventral visual cortex and amygdala originates the identification of emotional relevance.
This work was supported by National Institutes of Mental Health Grant P50-MH-072850.