It is well established that the formation of memories for life’s experiences—episodic memory—is influenced by how we attend to those experiences, yet the neural mechanisms by which attention shapes episodic encoding are still unclear. We investigated how top-down and bottom-up attention contribute to memory encoding of visual objects in humans by manipulating both types of attention during functional magnetic resonance imaging (fMRI) of episodic memory formation. We show that dorsal parietal cortex—specifically, intraparietal sulcus (IPS)—was engaged during top-down attention and was also recruited during the successful formation of episodic memories. By contrast, bottom-up attention engaged ventral parietal cortex—specifically, temporoparietal junction (TPJ)—and was also more active during encoding failure. Functional connectivity analyses revealed further dissociations in how top-down and bottom-up attention influenced encoding: while both IPS and TPJ influenced activity in perceptual cortices thought to represent the information being encoded (fusiform/lateral occipital cortex), they each exerted opposite effects on memory encoding. Specifically, during a preparatory period preceding stimulus presentation, a stronger drive from IPS was associated with a higher likelihood that the subsequently attended stimulus would be encoded. By contrast, during stimulus processing, stronger connectivity with TPJ was associated with a lower likelihood the stimulus would be successfully encoded. These findings suggest that during encoding of visual objects into episodic memory, top-down and bottom-up attention can have opposite influences on perceptual areas that subserve visual object representation, suggesting that one manner in which attention modulates memory is by altering the perceptual processing of to-be-encoded stimuli.
The creation of new memories for life’s experiences is intimately linked to how we attend (Craik & Lockhart, 1972). During an event, we may willfully direct attention to information relevant to ongoing goals (‘top-down’ attention) or our attention may be captured by salient information (‘bottom-up’ attention) (Corbetta & Shulman, 2002; Kastner & Ungerleider, 2000); how such attentive acts shape event encoding is unclear.
The neurobiology of top-down and bottom-up attention is posited to consist of separable yet interacting fronto-parietal networks (Corbetta & Shulman, 2002), and recent work has begun to consider how the parietal components of these networks are engaged during event encoding (Uncapher & Wagner, 2009). During attention paradigms, functional imaging data indicate that dorsal posterior parietal cortex (dPPC) activity tracks demands on top-down attention, and ventral posterior parietal cortex (vPPC) activity tracks recruitment of bottom-up attention (reviewed in Corbetta et al., 2008). During event encoding, fMRI signal in frontal and parietal regions frequently predicts whether the event will be subsequently remembered or forgotten (‘subsequent memory effects’) (reviewed in Paller & Wagner, 2002; Rugg et al., 2002; Blumenfeld & Ranganath, 2006; Uncapher & Wagner, 2009). A recent across-study analysis of the encoding literature raised the possibility that PPC subsequent memory effects anatomically overlap with PPC correlates of attention (Uncapher & Wagner, 2009). Moreover, the majority of PPC foci demonstrating enhanced activity for events later remembered vs. forgotten (‘positive’ subsequent memory effects) localized to dPPC, whereas foci demonstrating enhanced activity for events later forgotten vs. remembered (‘negative’ subsequent memory effects) localized exclusively to vPPC.
These data motivated the ‘dual-attention encoding hypothesis’, which posits that top-down and bottom-up attention may differentially foster encoding success and failure, respectively (Uncapher & Wagner, 2009). This hypothesis builds on evidence that top-down attention enhances cortical representations of attended information (e.g., Corbetta et al., 1990; reviewed in Beck & Kastner, 2008; Noudoost et al., 2010), and thus may increase the probability that attended information ultimately projects to the medial temporal lobe (MTL) for encoding (e.g., Moscovitch & Umilta, 1990; Uncapher & Rugg, 2009). From this perspective, positive subsequent memory effects emerge when dPPC-mediated top-down attention is directed towards information that will be relevant to later remembering. By contrast, engagement of vPPC-mediated bottom-up attention may lead to later memory failure when attention is captured by information not relevant or helpful to later remembering (Otten & Rugg, 2001b; Wagner & Davachi, 2001; Cabeza et al., 2008; Uncapher & Wagner, 2009).
Because consideration of PPC attention mechanisms on event encoding has largely come from across-study comparisons, it remains unclear whether parietal fluctuations that predict the mnemonic fate of events reflect the influence of top-down and bottom-up attention on episodic encoding. Here we implemented a paradigm that provides independent assays of attention and encoding, within-subjects, to determine (a) whether dPPC and vPPC show attention-sensitive activity that is also predictive of subsequent memory success or failure, and (b) whether the manner in which these attention-sensitive regions dynamically interact with the rest of the brain predicts the mnemonic fate of an event.
Data are reported from 18 right-handed, native-English speaking volunteers (9 female; age range: 18–27 yrs; mean = 20.4, SD = 2.9), each of whom gave written informed consent in accordance with procedures approved by the Institutional Review Board of Stanford University. All participants were recruited from the Stanford community, reported no history of neurological trauma or disease, and were remunerated at a rate of $20/hr. Data from two additional participants were acquired but excluded from analyses, due to their failure to exhibit a behavioral reorienting effect (see below).
Stimuli consisted of 660 black-and-white line drawings of common objects and 80 artificial ‘greeble’ objects (Gauthier & Tarr, 1997). Common objects were drawn from the International Picture Naming Project (Szekely et al., 2004) and from a set previously used by Uncapher & Rugg (2009). Greebles and common objects were transformed into black-and-white line drawings in Photoshop CS v8.0 (Adobe Systems Inc.). For the purpose of counterbalancing, common objects were divided into 33 sets of 20 objects each, and greebles into five sets of 16 objects each. Object sets were rotated across study and test lists across subjects, and greebles across study lists. For each subject, 10 study lists were created from these sets, each containing 40 objects and 16 greebles. In each study list, 46 stimuli were presented in the cued location (‘valid’ trials, see below; 32 objects and 14 greebles) and the remaining 10 in the non-cued location (‘invalid’ trials; 8 objects and 2 greebles), yielding an 18% probability that a stimulus would be invalidly cued.
Of the 400 common objects encountered during the study phase, 350 were carried forward into the test phase (to minimize test list length). Thus, in each of 10 sessions, 35 common objects served as the critical study items (27 valid and 8 invalid). In the later memory test, 180 common objects served as foils. A separate set of 10 common objects and 3 greebles were used to create a practice study list.
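The stimulus counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative, and only the counts themselves come from the text.

```python
# Stimulus counts as stated in the text; variable names are illustrative.
N_LISTS = 10
VALID = {"objects": 32, "greebles": 14}    # stimuli shown in the cued location
INVALID = {"objects": 8, "greebles": 2}    # stimuli shown in the non-cued location

per_list = sum(VALID.values()) + sum(INVALID.values())    # stimuli per study list
p_invalid = sum(INVALID.values()) / per_list               # proportion invalidly cued

studied_objects = N_LISTS * (VALID["objects"] + INVALID["objects"])
tested_objects = N_LISTS * (27 + 8)          # critical items: 27 valid + 8 invalid per list
test_list_length = tested_objects + 180      # critical items plus unstudied foils

print(per_list, round(p_invalid, 3), studied_objects, tested_objects, test_list_length)
# 56 0.179 400 350 530
```

The invalid proportion of 10/56 rounds to the 18% reported, and the complementary 82% matches the proportion of valid trials.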
Within the present task context, ‘top-down attention’ refers to the goal-directed allocation and maintenance of visuospatial attention evoked by a preparatory cue (described below), and ‘bottom-up attention’ refers to a shift of visuospatial attention evoked by an object occurring outside the focus of attention. We are agnostic as to whether bottom-up shifts of attention occur by means of automatic, involuntary, or ‘exogenous’ processes vs. goal-directed, voluntary, or ‘endogenous’ processes (for a review and discussion see Corbetta et al., 2008; see also Posner & Cohen, 1984; Downar et al., 2001; de Fockert et al., 2004; Kincade et al., 2005; Serences et al., 2005; Indovina & Macaluso, 2007; Burrows & Moore, 2009).
The experiment consisted of 10 scanned incidental study sessions followed by one non-scanned memory test. Each study session lasted approximately 6.5 min, and the memory test lasted approximately 40 min. The interval between the end of the study phase and the start of test phase was approximately 10–15 min. Participants received instructions and practiced the study task prior to entering the scan suite; study task performance was analyzed during this training session to ensure that participants understood and complied with the task instructions (see below). Moreover, eye movements were visually monitored by two investigators and feedback was given to the participant to ensure participants could perform the study task while maintaining fixation; participants were trained to ceiling level, continuing until no visually detectable saccades occurred at any point during the training session. We acknowledge that this training does not ensure that eye movements did not occur during scanning. However, given recent studies of overt vs. covert attentional shifts that demonstrate that the same fronto-parietal regions are engaged when eye movements occur and when they do not (e.g., Ikkai & Curtis, 2008), we believe the possible presence of eye movements in the present study does not alter the interpretation of the findings. We also note that eye movements were not recorded during scanning in many prior studies of Posner attention cueing, including the study to which we compare our attention-related effects (Corbetta et al., 2000).
During each study session, participants viewed a black screen containing two white boxes (subtending 5° horizontal and vertical angles) appearing to the right and left of a white central fixation crosshair (Fig. 1). The nearest edges of the boxes subtended 1° horizontal angle from central fixation. The beginning of each trial was indicated by the appearance of a green arrow cue, which replaced the central fixation for 1 s. The arrow pointed to the left or right, cuing subjects to covertly shift their attention (without moving their eyes) to the corresponding box. After a variable cue-to-stimulus interval (CSI)—either 1, 3, or 5 s—an object appeared for 500 ms in one of the two boxes. On 82% of trials, the object appeared in the cued box (‘valid’ trials), while on the remaining trials it appeared in the non-cued box (‘invalid’ trials). Upon appearance of the object, participants were to indicate whether the stimulus represented a real object or belonged to the artificial object class of greebles (index and middle finger key press, respectively); response hand was counterbalanced across subjects. Speed and accuracy were given equal emphasis in the task instructions. A variable intertrial interval (ITI) of 1.5, 3.5, or 5.5 s separated the offset of the object stimulus from the onset of the following cue. The variable CSIs and ITIs were pseudo-randomly distributed across trials such that the regressors in the General Linear Model (see fMRI data analysis) that estimated the neural responses to cues and objects were minimally correlated (< 0.13), thus allowing activity elicited by each phase in the trial (cue and object) to be independently assessed. Stimuli were presented in pseudo-random order, with no more than three trials of one item-type (objects/greebles) occurring consecutively. Stimuli were projected onto a mirror mounted on the MRI headcoil.
Immediately following the final study session, participants were transferred from the scanner to a neighboring testing suite and received instructions for the surprise memory test. The test list comprised 350 studied (old) and 180 unstudied (new) common objects, centrally presented, individually, in pseudo-random order. For each test object, participants were to indicate whether or not they had encountered the item in any of the study sessions, and to indicate their level of confidence. One of four responses (sure old, unsure old, unsure new, sure new) was made with the index or middle fingers of each hand. Old and new responses were made using separate hands, with high and low confidence indicated by middle and index fingers, respectively. The mapping of hands to old or new responses was counterbalanced across participants.
Each test trial began when the white fixation crosshair changed to red for 500 ms (Fig. 1), after which the test object was presented and remained onscreen until a response was made or until 4 s elapsed (if no response was made). If the object was judged to be new, the test advanced to the next trial. If the object was judged old, it remained onscreen for up to 4 more seconds, during which time participants attempted to recollect the location at which the object had appeared during study (cued with the appearance of a “Left?”, “Right?” prompt; Fig. 1). Participants made a left or right index finger button press to indicate memory for the object appearing on the left or right side of the screen, respectively, and made their best guess when uncertain. A 1.5 s ITI (displaying a white fixation crosshair) separated test trials. To mitigate fatigue, participants were given a self-paced break after each third of the test.
Whole-brain imaging was performed on a 3T Signa MR scanner (GE Medical Systems). Anatomical images were collected using a high-resolution T1-weighted spoiled gradient recalled (SPGR) pulse sequence (130 slices; 1.5 mm thick; 256 × 256 matrix; 0.86 mm² in-plane resolution). Functional images were obtained from 30 axial slices (4 mm thick), aligned to the AC-PC plane and positioned to give full coverage of the cerebrum and most of the cerebellum, using a T2*-weighted two-dimensional gradient-echo spiral-in/out pulse sequence [repetition time (TR) = 2 s; echo time (TE) = 30 ms; flip angle 75°; 64 × 64 matrix; 3.44 mm² in-plane resolution]. Data were acquired in 10 sessions of 190 volumes each. Volumes within sessions were acquired continuously in a descending sequential order. The first four volumes were discarded to allow tissue magnetization to achieve a steady state.
Data preprocessing and statistical analyses were performed with Statistical Parametric Mapping (SPM5, Wellcome Department of Cognitive Neurology, London, UK: http://www.fil.ion.ucl.ac.uk/spm/software/spm5/), implemented in MATLAB 7.7 (The Mathworks, Inc.). For each participant, all volumes were realigned spatially to the first volume, and then to the across-run mean volume. All volumes were corrected for differences in acquisition times between slices (temporally realigned to the acquisition of the middle slice). The anatomical volume was coregistered to the mean functional volume, and then a unified segmentation procedure (Ashburner & Friston, 2005) was applied to segment the anatomical volume into gray matter, white matter, and cerebrospinal fluid. The segmented images were deformed to probabilistic maps of each tissue type in Montreal Neurological Institute (MNI) space, and the resulting deformation parameters were applied to the functional images for normalization. Functional images were resampled into 3-mm³ voxels using non-linear basis functions (Ashburner & Friston, 1999). Functional images were concatenated across sessions. Normalized functional images were smoothed with an isotropic 8-mm full width half maximum (FWHM) Gaussian kernel.
Statistical analyses were performed in two stages of a mixed effects model. In the first stage, neural activity was modeled by a delta function (impulse event) at the onset of each cue and each stimulus. These functions were convolved with a canonical hemodynamic response function (HRF) and its temporal and dispersion derivatives (Friston et al., 1998) to yield regressors in a General Linear Model (GLM) that modeled the BOLD response to each event type. The two derivatives modeled variance in latency and duration, respectively. Analyses of the parameter estimates pertaining to the dispersion derivative of cue-related effects are reported below (parameter estimates pertaining to the derivatives of other items contributed no theoretically meaningful information beyond that contributed by the canonical HRF, and thus are not reported).
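The first-stage modeling step, in which impulse events are convolved with a canonical HRF, can be illustrated with a short sketch. This is a simplified stand-in rather than the SPM implementation: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6) are commonly used defaults assumed here, the temporal and dispersion derivatives are omitted, and the onsets and run length are hypothetical.

```python
import math

TR = 2.0        # repetition time in seconds (from the acquisition parameters)
N_SCANS = 20    # short illustrative run, not the 190-volume sessions

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

def double_gamma_hrf(t):
    """Assumed double-gamma HRF: an early positive response minus a smaller late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

def convolve_onsets(onsets, n_scans=N_SCANS, tr=TR):
    """Convolve delta functions at the given onsets with the HRF, sampled once per scan."""
    return [sum(double_gamma_hrf(i * tr - onset) for onset in onsets)
            for i in range(n_scans)]

# Hypothetical single trial: cue at 0 s, object at 4 s.
cue_regressor = convolve_onsets([0.0])
object_regressor = convolve_onsets([4.0])

# Scan index at which each regressor peaks (scan time = index * TR).
print(cue_regressor.index(max(cue_regressor)),
      object_regressor.index(max(object_regressor)))
# 3 5
```

With a single trial the two regressors are simply shifted copies of one another; varying the cue-to-stimulus and intertrial intervals across many trials is what reduces the correlation between cue and object regressors.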
The timeseries were high-pass filtered to 1/128 Hz to remove low-frequency noise and scaled to a grand mean of 100 across both voxels and scans. As described below, parameter estimates for events of interest were estimated using one of two GLMs. Nonsphericity of the error covariance was accommodated by an AR(1) model, in which the temporal autocorrelation was estimated by pooling over suprathreshold voxels (Friston et al., 2002). The parameters for each covariate and the hyperparameters governing the error covariance were estimated using Restricted Maximum Likelihood (ReML). Effects of interest were tested using linear contrasts of the parameter estimates. These contrasts were carried forward to a second stage in which subjects were treated as a random effect. Unless otherwise specified, whole-brain analyses were employed when no strong regional a priori hypothesis was possible; in these cases, only effects surviving an uncorrected threshold of p < .001 and including four or more contiguous voxels were interpreted. When we held an a priori hypothesis about the localization of a predicted effect, we corrected for multiple comparisons by employing a small volume correction (SVC) using family-wise error (FWE) rate based on the theory of random Gaussian fields (Worsley et al., 1996). The peak voxels of clusters exhibiting reliable effects are reported in MNI coordinates.
Regions of overlap between the outcomes of two contrasts were identified by inclusively (or ‘conjunctively’) masking the relevant SPMs. When two contrasts are independent, the statistical significance of the resulting SPM can be computed using Fisher’s method of estimating the conjoint significance of independent tests (Fisher, 1950; Lazar et al., 2002). For each inclusive masking procedure, the constituent contrasts were determined to be independent with tests of orthogonality for linear contrasts. This test concludes that two linear contrasts are statistically independent if the sum of the products of the coefficients is equal to zero; i.e., for contrasts A and B: if a1b1 + a2b2 + a3b3 + a4b4 = 0, then contrasts are orthogonal. The goal of inclusive masking in the present study was to search within regions that exhibited one pattern of activity to identify whether any voxels also showed a second pattern. As such, we maintained the original threshold of the contrast identifying the first pattern (p < .001), using a threshold of p < .05 for the masking contrast, to give a conjoint significance of p < .0005. Exclusive (or ‘disjunctive’) masking was employed to identify voxels where effects were not shared between two contrasts. The SPM constituting the exclusive mask was thresholded at p < .10, whereas the contrast to be masked was thresholded at p < .001. Note that the more liberal the threshold of an exclusive mask, the more conservative is the masking procedure.
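Both the orthogonality test and the conjoint significance can be reproduced in a few lines. The sketch below is illustrative: the function names and the example contrast vectors are hypothetical, and the combined p-value uses the closed-form upper tail of the chi-square distribution with 4 degrees of freedom, which is what Fisher's method reduces to for two independent tests.

```python
import math

def contrasts_orthogonal(a, b, tol=1e-12):
    """Two linear contrasts are orthogonal when the sum of coefficient products is zero."""
    return abs(sum(x * y for x, y in zip(a, b))) < tol

def fisher_two_tests(p1, p2):
    """Fisher's combined p-value for two independent tests.

    The statistic -2 * ln(p1 * p2) follows a chi-square distribution with
    4 degrees of freedom, whose upper tail equals p1 * p2 * (1 - ln(p1 * p2)).
    """
    product = p1 * p2
    return product * (1.0 - math.log(product))

# Hypothetical contrasts over four conditions (not the study's design matrix).
validity = [1, 1, -1, -1]
memory = [1, -1, 1, -1]

print(contrasts_orthogonal(validity, memory))    # True
print(round(fisher_two_tests(0.001, 0.05), 5))   # 0.00055
```

Combining thresholds of p < .001 and p < .05 in this way gives approximately .0005, in line with the conjoint significance stated above.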
To identify the relationship between attention effects and encoding effects, we orthogonalized the two factors in our design: (a) top-down attention effects were identified by interrogating cue-related activity for all items, collapsing over subsequent memory status, (b) bottom-up attention effects were identified by comparing validly cued vs. invalidly cued items, and (c) positive and negative subsequent memory effects were identified by comparing subsequently remembered vs. forgotten items. To accommodate these analyses, two GLMs were estimated. The first was optimized to model stimulus-related effects (i.e., subsequent memory effects and bottom-up attention effects). Study trials were segregated according to subsequent memory [high confidence hits (HCH), low confidence hits (LCH), misses (M)] and cue validity (valid, invalid), resulting in six object-related regressors. Two additional regressors modeled greebles (validly and invalidly cued). Objects for which memory was not later tested were modeled as events of no interest, as were objects for which a response was omitted at test. Nine additional regressors modeled the cue-related activity associated with each of the aforementioned stimulus types. Finally, six regressors modeled movement-related variance (three rigid-body translations and three rotations determined from the realignment stage), and session-specific constant terms modeled the mean over scans in each session.
The second GLM implemented a parametric modulation analysis designed to detect top-down attention effects. Previous Posner cueing studies (e.g., Corbetta et al., 2000) identified top-down attention effects by isolating activity elicited by cues to shift covert attention. In order to separately estimate neural responses associated with cues vs. stimuli, prior studies employed ‘catch trials’ in which no stimulus appeared post-cue (e.g., Corbetta et al., 2000). Here we opted to de-correlate cue- and stimulus-related activity by parametrically varying the cue-to-stimulus interval (CSI: 1, 3, or 5 s). Furthermore, because cue-related activity could reflect not only top-down attention effects but also low-level visual responses to the cue itself, we used a parametric modulation analysis to distinguish activity responding transiently to the cue from activity sustained across the duration of the variable CSIs. In other words, by isolating BOLD responses that were elicited by the cue to shift attention and that were also sustained throughout the variable interval over which attention was to be maintained, we could rule out the possibility that any cue-related effects were simply a reflection of low-level visual responses. To accomplish this, we included an orthogonal variable in the GLM to isolate top-down effects, identifying cue-related activity that increased with increasing CSI duration. It should be noted that because the HRF effectively integrates the total activity over a period of a few seconds, parametric modulators identify activity that is modulated in duration or magnitude (or both). As we discuss below, the obtained data suggest that the parametric modulator captured variance in duration of the cue-related HRF. In sum, for this second GLM, one regressor modeled all cue-related activity, and a second regressor parametrically modulated the first according to the CSI following each cue.
Thus, the second regressor for each subject in this GLM was of equal length to the first regressor and comprised a vector of values where each value represented the duration of the CSI for the corresponding cue. This regressor identified voxels in which cue-related activity varied as a linear function of the duration over which attention was to be maintained. To accommodate the remainder of the known variance, a third regressor modeled all stimulus-related activity (not segregated according to subsequent memory or validity), and movement and session effects were modeled as described above.
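The structure of the two cue regressors can be sketched before HRF convolution. In the minimal example below, the CSI sequence is hypothetical, and mean-centering the modulator is an assumption that reflects common practice for keeping a modulator orthogonal to the unmodulated regressor; the text itself specifies only that each value represented the CSI duration.

```python
def mean_center(values):
    """Subtract the mean so the modulator is orthogonal to constant event weights."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical sequence of cue-to-stimulus intervals (s), one per cue.
csi = [1, 5, 3, 3, 1, 5]

main_weights = [1.0] * len(csi)   # first regressor: every cue weighted equally
modulator = mean_center(csi)      # second regressor: weight scales linearly with CSI

# Orthogonality check on the event weights (before convolution with the HRF).
dot = sum(m * w for m, w in zip(modulator, main_weights))

print(modulator)   # [-2.0, 2.0, 0.0, 0.0, -2.0, 2.0]
print(dot)         # 0.0
```

A voxel loading positively on the modulator shows larger or longer cue-related responses on trials with longer CSIs, over and above the average cue response captured by the first regressor.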
An alternative method of interrogating sustained BOLD responses across the CSI is to analyze effects associated with the dispersion derivative of the cue-related HRF. In the present design, cue-related attention effects (but not cue-related visual effects) might show a more sustained response, which would be captured by significant loading on the dispersion derivative. Accordingly, below we report the cue-related dispersion derivative outcome, which confirmed the parametric modulator outcome. However, we note that a dispersion derivative analysis is less optimal than a parametric modulation analysis for our design for two reasons. First, a dispersion derivative analysis can only accommodate variance in the duration of the HRF up to approximately 1 s, whereas here the CSI was 1, 3, or 5 s. Second, the differing durations of the CSI are suboptimally modeled by a dispersion derivative, which is fit across all trial types (i.e., 1, 3, and 5 s CSIs). This trial-by-trial variability is better modeled by a parametric modulation analysis (Henson, 2007).
Response profiles of regions of interest (ROIs) identified in map-wise analyses were investigated using the deconvolution algorithm implemented in MarsBar (marsbar.sourceforge.net). This algorithm deconvolves the BOLD signal in the ROI using a finite impulse response function, which assumes no shape for the hemodynamic response. From these analyses, the cluster-wise integrated percent signal change was extracted for further characterization of the functional response (see Results for details).
The second main goal of the present study was to determine whether the functional dynamics of the putative dorsal and ventral attention networks changed when an event memory was effectively formed relative to when one was not. An understanding of how these attention networks dynamically interact with other neural structures during the encoding of event information may inform how episodic memories are created. We therefore sought to determine whether neural components of the dorsal and ventral attention networks showed connectivity profiles that predicted later memory success or failure.
Multivariate connectivity analyses were conducted by submitting seed regions – ROIs identified in the univariate analyses as exhibiting top-down or bottom-up attention effects (see next paragraph) – to psychophysiological interaction (PPI) analyses to determine whether they showed memory-related functional connectivity with other regions in the brain (Friston et al., 1997). In this manner, we investigated whether the connectivity between attention-related regions and other brain regions differed as a function of encoding success or failure.
Seed regions were components of PPC that exhibited relevant attention effects: a left dPPC (mIPS/SPL) region that displayed top-down attention effects, and bilateral vPPC (TPJ) regions that displayed bottom-up attention effects (for details, see Results section, ‘Neural correlates of top-down and bottom-up attention’). Using standard PPI analysis techniques, seed clusters were individually defined for each subject on the basis of the random effects group analyses. For each subject, the data for each seed region was the principal eigenvariate of all significant (p < .05) voxels within a 4-mm sphere, centered on the local peak maximum that fell within 2*FWHM of the smoothing kernel (i.e., 16 mm) and was within the anatomical region of interest, identified from each subject’s normalized structural scan. These subject-specific timeseries represented the ‘physiological’ component of the PPI. All but one of the 18 subjects met all criteria for every seed region; PPI analyses were performed on these 17 subjects. The timeseries were adjusted for variance associated with effects of no interest, and then a deconvolution with the hemodynamic responses was performed. The resultant vector was weighted by a contrast vector representing the relevant ‘psychological’ factor (in this case, a subsequent memory contrast: HCH vs. M), and then reconvolved with the hemodynamic responses. The outcome of this process formed the ‘psychophysiological interaction’, or PPI regressor. This regressor models the between-condition difference in regression slopes between each voxel in the brain and the seed region. This PPI regressor was entered into a GLM, along with regressors modeling main effects of the psychological and physiological factors (i.e., the condition contrast vector and the timeseries, respectively). In line with the univariate GLMs, we additionally modeled movement-related effects as well as session-specific constant terms. 
Because PPI analyses are inherently less powered than their univariate counterparts (as only the unshared variance between the three regressors is attributed to the interaction term, or the PPI regressor), we adopted a whole-brain uncorrected threshold of p < .005, four-voxel extent. Voxels that surpassed threshold in these PPI analyses can be interpreted as showing a significant difference in connectivity with the attention-related seed region as a function of later memory outcome.
Participants performed at ceiling on the object/greeble discrimination task for items that were cued validly and invalidly (.98, SE = .0002 and .99, SE = .0002, respectively). All but two of 20 participants showed evidence of attentional reorienting, as indexed by longer response times (RTs) to items appearing in the invalidly cued relative to the validly cued location; as noted above, data from the two subjects failing to show this behavioral reorienting effect were omitted from all analyses. Data from the 18 participants submitted to analysis revealed that the reorienting effect (all items: F(1,17) = 21.00, p < .001; Table 1) was significant for both common objects (t17 = 5.97, p < .001) and greebles (t17 = 2.91, p < .01). The CSI between the orienting cue and the onset of the object did not impact the reorienting effect, as evidenced by the absence of an interaction between CSI and validity (F < 1).
Recognition memory was estimated by calculating the difference in the probabilities of an ‘old’ response to an old vs. a new item [Pr = p(‘old’ | Old) − p(‘old’ | New)]. Recognition memory performance was superior when participants were highly confident in their old/new decision (Pr high confidence = .30, SE = .03) relative to when a low confidence response was given (Pr low confidence = .07, SE = .01) (Pr high confidence vs. Pr low confidence, t17 = 5.75, p < .001). Analyses of d′, excluding three participants who had zero high confidence false alarms, resulted in the same pattern as the Pr analyses: high confidence d′ = 1.25, SE = .11 vs. low confidence d′ = .29, SE = .08.
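The two recognition indices can be computed as in the following minimal sketch. The hit and false-alarm rates are hypothetical values chosen for illustration, not the study's data, and the function names are illustrative.

```python
from statistics import NormalDist

def pr_index(hit_rate, false_alarm_rate):
    """Pr: probability of an 'old' response to old items minus that to new items."""
    return hit_rate - false_alarm_rate

def d_prime(hit_rate, false_alarm_rate):
    """d' = z(hit rate) - z(false-alarm rate); undefined for rates of exactly 0 or 1."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical rates for illustration only.
hits, false_alarms = 0.40, 0.10

print(round(pr_index(hits, false_alarms), 2))   # 0.3
print(round(d_prime(hits, false_alarms), 2))    # 1.03
```

Because the inverse normal transform is undefined at a rate of zero, a participant with no false alarms has no finite d′ unless a correction is applied, which is consistent with the exclusion of three participants from the d′ analysis.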
For studied items that were confidently endorsed as old, participants were well above chance (.5) when indicating the location in which the object was studied (‘source memory’ accuracy: mean = .71, SE = .03; t17 = 6.54, p < .001). This was not the case, however, when low confidence judgments were given (mean = .52, SE = .02; t17 < 1). Thus, because both recognition and source memory performance were poor for low confidence responses, the fMRI subsequent memory analyses focused on comparison of high confidence hits (HCH) vs. misses (M).
The need to reorient attention to invalidly cued objects at study negatively impacted subsequent memory for those objects (Table 2). Specifically, invalidly cued objects were later confidently recognized less often than were validly cued objects (t17 = 5.18, p < .0005). However, when invalidly cued objects were subsequently recognized with high confidence, source memory performance was equivalent to that for validly cued objects (t17 < 1). This finding suggests that the following neuroimaging comparisons of valid and invalid subsequent memory effects (HCH vs. M) were not biased in favor of conditions associated with superior source memory. Such a bias would be introduced, however, if subsequent memory analyses were expanded to include all recognized items (rather than restricted to HCHs), as source memory was better for all hits in the valid vs. invalid conditions (t17 = 1.9, p < .04). This finding of biased source memory for all hits, but unbiased source memory for high confidence hits, reinforces the restriction of subsequent memory analyses to high confidence hits. Finally, study task RTs conditionalized as a function of later memory performance revealed that study RTs did not differ as a function of subsequent memory (Table 1; valid HCH vs. M: t17 < 1; invalid HCH vs. M: t17 < 1). This finding rules out the possibility that the neural subsequent memory effects were simply a consequence of the amount of time spent initially processing study items (i.e., differential ‘duty cycles’).
The present experiment was designed to directly assess the degree to which top-down and bottom-up attention mechanisms contribute to the formation of event memories. Our analysis strategy first considered the factors of attention and encoding separately, and then examined the relationship between them by (a) investigating regional overlap between attention and encoding effects, and (b) investigating connectivity effects.
We first examined whether our attention paradigm gave rise to patterns of activity similar to those observed in prior Posner cueing studies (which primarily used detection tasks with repeated simple shapes, whereas our paradigm included a discrimination task and trial-unique meaningful objects). Based on prior studies, top-down attention effects were predicted to (a) be elicited by the arrow cues to shift attention, and (b) vary in duration according to the interval over which attention was maintained (the variable CSI) (see Experimental procedures). Supporting these predictions, a parametric modulation analysis revealed that cue-related activity (cue > fixation) varied positively according to CSI in the frontal eye fields (FEF) and in the left medial intraparietal sulcus (mIPS), extending into superior parietal lobule (SPL) (Fig. 2; Table 3). These findings are consistent with an extensive literature suggesting that these fronto-parietal regions form a ‘dorsal attention network’ (e.g., Corbetta et al., 1993; Nobre et al., 1997; Kastner et al., 1999; Hopfinger et al., 2000; Sylvester et al., 2007; reviewed in Corbetta et al., 2008).
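The logic of a parametric modulation analysis can be illustrated with a minimal sketch. It assumes stick-function regressors sampled once per scan, and the onsets, CSI values, and TR below are hypothetical. In practice both regressors would also be convolved with a hemodynamic response function, a step omitted here.

```python
def mean_center(values):
    """Subtract the mean so the modulator is orthogonal to a constant."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def build_regressors(onsets, modulator, n_scans, tr):
    """Return (main, pmod): an unmodulated cue regressor and a
    mean-centered parametric modulator, both as per-scan stick functions."""
    main = [0.0] * n_scans
    pmod = [0.0] * n_scans
    for onset, weight in zip(onsets, mean_center(modulator)):
        scan = int(onset // tr)  # scan index in which the cue occurred
        main[scan] += 1.0        # same height for every cue
        pmod[scan] += weight     # height scales with the (centered) CSI
    return main, pmod


# Hypothetical cue onsets (s) and cue-stimulus intervals (s); illustration only.
cue_onsets = [0.0, 12.0, 24.0, 36.0]
csi = [1.0, 3.0, 5.0, 3.0]
main, pmod = build_regressors(cue_onsets, csi, n_scans=24, tr=2.0)
```

A voxel whose response loads positively on `pmod`, over and above `main`, is one whose cue-related activity increases with CSI, which is the pattern described for FEF and mIPS/SPL.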
Confirming the outcome of the parametric modulation analysis, analysis of non-modulated cue effects (i.e., the canonical cue-related HRF) revealed the same set of regions (bilateral FEF and left mIPS/SPL). Notably, this contrast also identified an additional set of regions in visual cortex (bilateral visual cortex, centered on [12 −99 6] and [−18 −93 6], Z = 4.17), consistent with the idea that the non-modulated effects reflect a combination of low-level visual and high-level attention-related processes. Finally, to determine which non-modulated effects exhibited a sustained response, we inclusively masked the non-modulated effects (p < .001) with the dispersion derivative contrast (p < .05). The outcome of this procedure revealed 34 voxels in mIPS/SPL (centered on [−24 −57 54], Z = 3.66) that showed a more sustained response than the canonical cue-related HRF, further confirming the findings from the parametric modulation analysis.
To identify neural correlates of bottom-up attention, objects appearing in unexpected locations were contrasted with those appearing in expected locations (i.e., invalidly > validly cued objects). Multiple regions were more active when the object was invalidly cued, including bilateral temporoparietal junction (TPJ) (Fig. 2; Table 3). This pattern is consistent with the proposal that TPJ is a key component of a ‘ventral attention network’ that mediates stimulus-driven reorienting of attention (Corbetta et al., 2008). Additional regions revealed in this contrast included FEF and IPS (Fig. 2; Table 3), a finding consistent with the hypothesis that stimulus-driven reorienting of attention triggers recruitment of the dorsal attention network (Corbetta et al., 2002; Giessing et al., 2006; Corbetta et al., 2008; Shulman et al., 2009). That is, effects revealed by this contrast may reflect the consequences of a stimulus-driven salience calculation (mediated in part by TPJ), which in turn serves to drive shifts in the locus of visuospatial attention (partially mediated by an FEF-IPS/SPL network) (e.g., Burrows & Moore, 2009; Shulman et al., 2009).
It is notable that these ‘validity effects’ appear to overlap with the CSI-varying cue-related effects in medial—but not lateral—IPS (Fig. 2, yellow). This apparent dissociation would extend recent evidence suggesting that lateral and medial IPS functionally differ (Nelson et al., 2010), with mIPS differentially tracking demands on top-down attention (e.g., Hutchinson et al. 2009; Sestieri et al., 2010; Uncapher et al., 2010). Here, mIPS was engaged during the components of the Posner task that are thought to recruit top-down attention (most directly evidenced by the parametrically modulated cue-related contrast, and indirectly evidenced by the validity contrast, for reasons described in the previous paragraph). Interestingly, this mIPS/SPL region appears to anatomically overlap with that revealed in a recent meta-analysis of the top-down attention literature (compare the present mIPS/SPL top-down effects in Fig. 2, yellow+green, to the top-down attention ALE map in Fig. 1 of Uncapher et al., 2010).
Importantly, the apparent selectivity of the present top-down effects to mIPS/SPL, not including lateral IPS, was confirmed by a disjunction analysis (i.e., identification of regions that exhibit significant effects in one contrast but not in the other). To identify regions that exhibited validity effects but not CSI-varying cue-related activity, we masked the validity contrast (p < .001) with the parametrically modulated cue-related contrast (thresholded at a lenient level, p < .10, to create a stringent mask). Importantly, left lateral IPS survived this disjunction analysis, indicating that it exhibited validity effects but no evidence of cue-related activity. We also performed the reverse disjunction analysis, masking the cue-related contrast (p < .001) with the validity contrast (thresholded at a lenient level, p < .10). This procedure did not reveal a disjunction in left mIPS/SPL (nor in FEF), suggesting that medial (and not lateral) parietal subregions are engaged during both the cue-related and validity contrasts (again, this finding is compatible with the view that top-down attention mechanisms can be triggered by stimulus-driven salience that drives the reorienting of attention). Together, these findings support the hypothesis that lateral and medial IPS functionally differ, with mIPS/SPL differentially tracking demands on top-down attention.
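The inclusive-masking and disjunction (exclusive-masking) procedures described above amount to simple set operations on thresholded maps. The following sketch assumes voxelwise p-values stored in dictionaries. The region labels and p-values are hypothetical and chosen only to mirror the qualitative pattern reported.

```python
def inclusive_mask(p_primary, p_mask, alpha_primary, alpha_mask):
    """Voxels significant in the primary contrast AND in the mask contrast."""
    return {v for v, p in p_primary.items()
            if p < alpha_primary and p_mask.get(v, 1.0) < alpha_mask}


def exclusive_mask(p_primary, p_mask, alpha_primary, alpha_mask):
    """Voxels significant in the primary contrast but NOT in the mask.
    A lenient alpha_mask removes more voxels, giving a stringent disjunction."""
    return {v for v, p in p_primary.items()
            if p < alpha_primary and not p_mask.get(v, 1.0) < alpha_mask}


# Hypothetical p-values per labeled voxel (illustration only).
validity = {"lateral_IPS": 0.0004, "mIPS_SPL": 0.0002, "elsewhere": 0.20}
cue_csi = {"lateral_IPS": 0.45, "mIPS_SPL": 0.0003, "elsewhere": 0.60}

validity_only = exclusive_mask(validity, cue_csi, 0.001, 0.10)
cue_only = exclusive_mask(cue_csi, validity, 0.001, 0.10)
both = inclusive_mask(validity, cue_csi, 0.001, 0.10)
print(validity_only, cue_only, both)
```

With these illustrative inputs, the lateral IPS voxel survives the validity-only disjunction, no voxel survives the cue-only disjunction, and the mIPS/SPL voxel appears in both contrasts.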
Prior studies identifying the neural correlates of encoding have contrasted study items that were subsequently remembered vs. forgotten (e.g., Brewer et al., 1998; Wagner et al., 1998; Henson et al., 1999; for reviews, see Wagner et al., 1999; Paller & Wagner, 2002; Spaniol et al., 2009). Accordingly, we first analyzed stimulus-related activity on trials most analogous to prior subsequent memory experiments, namely trials on which the object appeared in an expected location (validly cued trials), and was later remembered with high confidence (HCH) vs. later forgotten (M). Consistent with the literature on MTL and PPC encoding effects (for respective reviews, see Davachi, 2006; Uncapher & Wagner, 2009), this contrast revealed positive subsequent memory effects (HCH > M) in multiple regions, including bilateral hippocampus and right posterior IPS (Fig. 3; Table 4). Also evident were bilateral clusters that encompassed fusiform and lateral occipital (LO) cortex (Fig. 3); these ventral temporal-occipital foci appear to include the lateral occipital complex (LOC), which is implicated in visual object representation (e.g., Malach et al., 1995; Grill-Spector, Kourtzi, & Kanwisher, 2001; Grill-Spector & Malach, 2004). In support of this interpretation, an analysis of common objects vs. greebles identified a large swath of activity overlapping these fusiform and LO subsequent memory effects (available from the corresponding author upon request). Additionally, a positive subsequent memory effect was evident in left ventrolateral prefrontal cortex (VLPFC), a region consistently associated with episodic encoding success for verbalizable and/or meaningful stimuli (e.g., Kirchhoff et al., 2000; Baker et al., 2001; Davachi et al., 2001; Otten & Rugg, 2001a; Clark & Wagner, 2003).
[Note that subsequent memory effects were computed by comparing all HCHs (i.e., collapsed across source memory accuracy) to misses; to identify effects of source memory, we compared HCHs with and without source memory in the subset of participants (n = 15) with five or more trials in each condition. No region demonstrated greater encoding activity for HCHs with source memory at standard statistical thresholds (p < .001), whereas activation in right TPJ was greater during HCHs without source memory. Because these source memory analyses were underpowered, however, interpretative caution is warranted.]
Positive subsequent memory effects are often accompanied by negative subsequent memory effects, that is, by regions that show an enhanced response to items later forgotten relative to items later remembered. Consistent with prior reports (e.g., Otten and Rugg, 2001b; Wagner and Davachi, 2001; Daselaar et al., 2004; Gonsalves et al., 2004; Reynolds et al., 2004; Turk-Browne et al., 2006; Chua et al., 2007; Otten, 2007; Park and Rugg, 2008), negative subsequent memory effects (M > HCH) were observed in right vPPC, including TPJ (~BA 40), and in medial aspects of parietal and prefrontal cortex (Fig. 3; Table 4). These negative subsequent memory effects, as well as the preceding positive subsequent memory effects, were associated with object-related activity (analogous analyses of cue-related activity did not reveal significant correlates of subsequent memory at standard statistical thresholds; cf. Otten et al., 2006).
Stimulus-related signals that vary with subsequent memory may reflect a variety of processes. We therefore adopted a functional-localizer logic, using the top-down attention contrast to constrain the process space being investigated. To do so, we looked within the regions engaged during top-down attention (operationalized as parametrically modulated cue-related activity) to assess whether stimulus-period activity in these regions tracked subsequent memory. In other words, to determine whether the neural correlates of top-down attention overlapped with those of successful episodic encoding, we employed a conjunction analysis between the regions identified by the parametrically modulated cue-related contrast and those exhibiting positive object-related subsequent memory effects. Specifically, the cue-related contrast (thresholded at p < .001) was inclusively masked with the positive subsequent memory contrast for validly cued objects (thresholded at p < .05). This conjunction analysis (at a conjoint threshold of p < .0005) revealed that the bilateral FEF and left mIPS/SPL regions that exhibited a top-down cueing effect were again engaged when objects appeared (post-cue), and this engagement was predictive of subsequent memory (Fig. 4, green; Table 3, denoted in final column). [Note that additional analyses comparing HCHs with and without source memory failed to reveal a difference in encoding activation in left mIPS/SPL and FEF. Again, due to concerns about power, future studies are needed to determine whether left mIPS/SPL activation during encoding is specifically predictive of memory for item-location associations and/or is related to item memory.]
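The text above does not state how the conjoint threshold of the masked contrast was derived. One common way to estimate the joint significance of two independent tests is Fisher's method, which is assumed here purely for illustration. For two p-values the combined statistic has a closed form:

```python
import math


def fisher_conjoint_p(p1, p2):
    """Combined p-value for two independent tests via Fisher's method.

    The statistic -2 * ln(p1 * p2) follows a chi-square distribution with
    4 degrees of freedom, whose survival function reduces to k * (1 - ln k)
    with k = p1 * p2.
    """
    k = p1 * p2
    return k * (1.0 - math.log(k))


conjoint = fisher_conjoint_p(0.001, 0.05)
print(f"{conjoint:.6f}")
```

Under this assumption, thresholds of .001 and .05 combine to a value of roughly 5 × 10⁻⁴, the same order as the conjoint threshold reported above.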
Negative subsequent memory effects in TPJ have been posited to reflect the disruptive consequences of the bottom-up capture of attention during encoding, perhaps marking the diversion of attention by irrelevant event information (Otten & Rugg, 2001b; Wagner & Davachi, 2001; Uncapher & Wagner, 2009). Here, we explicitly tested this account by examining the relationship between the parietal structures engaged during the bottom-up capture of attention (i.e., greater activation to invalidly cued relative to validly cued objects) and those demonstrating negative subsequent memory effects for validly cued objects. Importantly, inclusive masking of the contrast identifying bottom-up attention effects (thresholded at p < .001) with that revealing negative subsequent memory effects (thresholded at p < .05) revealed overlap in bilateral TPJ (Fig. 4, blue; Table 3, denoted in final column). This pattern suggests that, at least for validly cued objects, greater TPJ activation may reflect the capture of bottom-up attention by irrelevant event features. Qualitative inspection of the observed hemodynamic responses in bilateral TPJ (Fig. 4) further indicated that TPJ responded maximally to items that appeared outside the current focus of attention (invalidly cued objects) and were later forgotten, and the least to items that appeared where expected (validly cued objects) and were later remembered. This observation of a negative subsequent memory effect in TPJ on invalidly cued trials – wherein bottom-up attention was presumably captured by the to-be-encoded object – was unexpected, as the capture of bottom-up attention by such objects was predicted a priori to foster their encoding. Accordingly, we next explored the subsequent memory effects on invalid trials in greater detail.
Prior subsequent memory studies have investigated encoding correlates for stimuli appearing in expected locations. In most cases, the spatial contingency was limited to one location (central fixation); in others, this contingency extended to multiple locations (e.g., Cansino et al., 2002; Sommer et al., 2005a, 2005b; Uncapher et al., 2006; Uncapher & Rugg, 2009). To our knowledge, no study has examined whether encoding mechanisms are similar or different for stimuli presented outside vs. inside the current focus of top-down attention. The present design—in which spatial expectations were probabilistically confirmed (validly cued objects) or violated (invalidly cued objects)—provided leverage on this question, though we note that this analysis is inherently underpowered due to the relatively lower frequency of invalidly cued objects (with an average of 62 trials contributing to subsequent memory analyses of invalid objects vs. 214 for valid objects). [Note that the limited number of invalid trials precluded subsequent source memory analyses from being conducted.]
To complement the preceding subsequent memory analysis for validly cued objects, we first identified positive and negative subsequent memory effects for invalidly cued objects [i.e., (HCH > M)invalid and (M > HCH)invalid]. A positive subsequent memory effect for invalidly cued objects was observed in right fusiform (Fig. 5A; Table 4); the absence of other positive correlates of encoding at the standard statistical threshold (p < .001) likely is due to low power. A follow-up statistically independent analysis of % signal change for this cluster further revealed a modest, but significant, positive subsequent memory effect for validly cued objects (t17 = 2.34; p < .02). Strikingly, the negative subsequent memory analysis revealed unexpected effects in dPPC, including in right IPS and SPL (Fig. 5B; Table 4; similar effects were observed in left IPS/SPL when the statistical threshold was relaxed to p < .005). This was surprising given the finding from a recent meta-analysis that all prior negative subsequent memory effects in parietal cortex have been reported in vPPC, with none in dPPC (Uncapher & Wagner, 2009). Even in the few studies that have directly manipulated goal-directed attention in a subsequent memory paradigm (Kensinger et al., 2003; Uncapher & Rugg, 2005, 2008, 2009), in none of these was dPPC activity reported to be associated with subsequent forgetting.
We next sought to identify regions where subsequent memory effects differed according to whether objects appeared in expected relative to unexpected locations. Voxel-level interaction analyses [i.e., (HCH > M)invalid > (HCH > M)valid, and vice versa] revealed that, while no regions showed positive encoding-related activity that was greater for invalid than valid trials (even when the threshold of the interaction was dropped to p < .01), several regions—including the right IPS and SPL regions identified above—exhibited the opposite pattern (valid > invalid subsequent memory effects; Fig. 5B; Table 5). Qualitative inspection of the signal in both regions (Fig. 5C) revealed a striking pattern of findings in SPL: a positive subsequent memory pattern for valid trials, reversing to a negative subsequent memory pattern for invalid trials. This reversal of positive to negative effects for items appearing inside vs. outside the focus of attention suggests that SPL may serve to bias the processing of information appearing in the cued location, at the expense of information appearing in the non-cued location. In so doing, SPL-mediated spatial attention processes may facilitate and hinder encoding, respectively. Thus, this reversal of positive to negative subsequent memory effects may explain why dPPC negative subsequent memory effects have not been reported in prior studies, as none have systematically manipulated whether objects appeared inside vs. outside the current focus of attention.
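The validity × subsequent memory interaction can be written as a set of contrast weights over the four conditions. The sketch below assumes one response estimate (beta) per condition for a single voxel. The beta values are hypothetical and chosen only to reproduce a crossover of the kind described for SPL.

```python
CONDITIONS = ("HCH_valid", "M_valid", "HCH_invalid", "M_invalid")

# (HCH > M)invalid > (HCH > M)valid
INVALID_GT_VALID = {"HCH_valid": -1, "M_valid": +1,
                    "HCH_invalid": +1, "M_invalid": -1}
# The reverse interaction simply flips every sign.
VALID_GT_INVALID = {c: -w for c, w in INVALID_GT_VALID.items()}


def contrast_value(betas, weights):
    """Weighted sum of condition betas for one voxel."""
    return sum(weights[c] * betas[c] for c in CONDITIONS)


# Hypothetical betas (illustration only): positive subsequent memory effect
# on valid trials, negative subsequent memory effect on invalid trials.
betas = {"HCH_valid": 0.6, "M_valid": 0.3, "HCH_invalid": 0.2, "M_invalid": 0.5}

sme_valid = betas["HCH_valid"] - betas["M_valid"]
sme_invalid = betas["HCH_invalid"] - betas["M_invalid"]
interaction = contrast_value(betas, VALID_GT_INVALID)
print(sme_valid, sme_invalid, interaction)
```

The interaction value equals the valid subsequent memory effect minus the invalid one, so a voxel with a positive effect on valid trials and a negative effect on invalid trials yields a large valid > invalid interaction.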
To evaluate the hypothesis that the reversal from positive to negative subsequent memory effects in SPL reflects the engagement of top-down attention, we inclusively masked the interaction contrast with the cue-related parametric analysis described above (p < .001 and p < .05, respectively). This procedure confirmed that a subset of the SPL cluster (6 of 26 voxels) showed both effects. Similarly, when we performed the reverse masking procedure (inclusively masking the cue-related parametric analysis with the validity × subsequent memory interaction analysis), a large proportion of the cluster (81 of the 125 voxels) exhibited both effects. Thus, the deployment of top-down attention appears to interact with stimulus location to either promote memory (if the stimulus appears in the cued location) or hinder memory (if it appears in the non-cued location).
One interpretation of the foregoing findings is that the more top-down attention is deployed to the expected location (as indexed by top-down attention effects in mIPS/SPL), the greater the reflexive reorienting response when the item appears in the unexpected location (as indexed by bottom-up attention effects in TPJ); under such circumstances, the redeployment of attention may be ineffective in promoting stimulus encoding (as indexed by negative subsequent memory effects in TPJ and SPL during invalid trials). Consistent with the former hypothesis, prior data indicate that expectations about where stimuli will appear modulate the magnitude of TPJ reorienting responses. For instance, Vossel and colleagues (2006) reported TPJ modulation according to two different levels of cue validity (cues were either 90% or 60% predictive of the location of the upcoming target), with a robust TPJ validity effect (invalid>valid activity) being observed in the high expectancy condition, but little to no effect being observed in the low expectancy condition.
To test the hypothesis that the more top-down attention is deployed to the expected location, the greater the activity in TPJ when a stimulus appears in the unexpected (vs. expected) location, we examined whether a relationship existed between mIPS/SPL top-down attention effects and TPJ validity effects. To do so, we regressed the extracted % signal change for each effect in each subject, and found a significantly positive correlation across subjects (right TPJ: r = .52, p < .02; left TPJ: r = .45, p < .03). In other words, those subjects who showed the greatest CSI-varying cue-related activity in mIPS/SPL also showed the greatest difference in TPJ activity when items appeared in the unexpected vs. expected location. To identify whether this correlation between regions also had a direct consequence for memory formation (as described above), we next performed a multiple linear regression analysis on the miss rate of items studied in the unexpected location. This analysis revealed a trend towards higher correlations between mIPS/SPL and TPJ effects and higher miss rates for items in the unexpected location (F = 3.262, p < .07), suggesting that those subjects exhibiting the greatest reorienting response in TPJ to items in the unexpected location (presumably due to allocating more top-down attention to the expected location, as indexed by mIPS/SPL activity) also had more difficulty effectively encoding items in unexpected locations into memory (as indexed by higher miss rates for these items).
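An across-subject correlation of this kind can be sketched as below. The per-subject effect sizes are hypothetical. The conversion from r to t assumes n = 18 subjects, as implied by the 17 degrees of freedom reported for the t-tests, and no claim is made about how the reported p-values were computed.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """t statistic (n - 2 degrees of freedom) corresponding to a correlation r."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical per-subject % signal change (illustration only).
mips_cue_effect = [0.10, 0.25, 0.05, 0.40, 0.30, 0.15]
tpj_validity_effect = [0.08, 0.20, 0.12, 0.35, 0.22, 0.10]
r = pearson_r(mips_cue_effect, tpj_validity_effect)
print(f"r = {r:.2f}")

# t value implied by the reported right-TPJ correlation under the n = 18 assumption.
t_right_tpj = r_to_t(0.52, 18)
```

Under that assumption, r = .52 corresponds to a t of about 2.44 on 16 degrees of freedom.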
The foregoing univariate analyses revealed that regions exhibiting top-down and bottom-up attention effects were also differentially associated with encoding success and failure. To understand the dynamic nature of this relationship between attention and memory formation, we next sought to characterize how attention-related parietal regions interact with the rest of the brain during the formation of event memories. To this end, we implemented psycho-physiological interaction (PPI) analyses to identify neural structures whose connectivity with attention-related parietal regions changed in a memory-related fashion. It should be noted that all PPI analyses use independent factors to identify and interrogate the data; in other words, we used the factor of attention to identify our seed ROIs, and then searched for regions whose connectivity with these seeds differed according to subsequent memory.
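The core of a PPI model is an interaction regressor formed from the seed time series (the physiological variable) and the task variable (the psychological variable). The sketch below is a simplified illustration under stated assumptions: the interaction is formed directly on the measured signal, whereas standard implementations typically form it at the estimated neural level after deconvolution. The seed values and the +1/−1 coding of later remembered vs. forgotten trials are hypothetical.

```python
def mean_center(values):
    """Remove the mean so main effects and interaction are less collinear."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def ppi_regressors(seed_ts, psych):
    """Return (physiological, psychological, interaction) regressors.

    seed_ts: seed-region signal per time point.
    psych:   task coding per time point (+1 remembered, -1 forgotten here).
    """
    phys = mean_center(seed_ts)
    psy = mean_center(psych)
    interaction = [a * b for a, b in zip(phys, psy)]
    return phys, psy, interaction


# Hypothetical seed signal and memory coding (illustration only).
seed = [1.0, 2.0, 3.0, 4.0]
memory = [1.0, -1.0, 1.0, -1.0]
phys, psy, ppi = ppi_regressors(seed, memory)
```

A target region that loads on the interaction term, after accounting for the two main-effect regressors, is one whose coupling with the seed differs between later remembered and later forgotten trials.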
First, we identified regions throughout the brain that interacted, during the preparatory (cue) period, with the mIPS/SPL region that exhibited top-down attention effects in our univariate analyses. Specifically, the analysis identified regions where connectivity with mIPS/SPL differed during trials on which the object was subsequently remembered vs. forgotten (‘subsequent memory connectivity effects’). Several regions exhibited such effects, i.e., stronger connectivity with mIPS/SPL during preparatory periods wherein the subsequently presented object was later remembered relative to forgotten (Fig. 6A; Table 6), including the left LO/fusiform region that showed a positive subsequent memory effect when processing validly cued objects (Fig. 3). Thus, stronger preparatory coupling between mIPS/SPL regions that mediate top-down attention and LO/fusiform regions that represent visual object form appears to facilitate encoding of subsequently encountered objects. Consistent with this interpretation, an across-subject regression revealed that subjects who showed stronger subsequent memory connectivity between mIPS/SPL and LO/fusiform (during the cue period) tended to demonstrate superior recognition memory at test (i.e., exhibited higher Pr for high-confidence judgments) relative to subjects who showed weaker connectivity effects (r(15) = .414, p < .05; Fig. 6A).
In addition to connectivity increases during the preparatory period that tracked later memory outcome, mIPS/SPL also exhibited connectivity decreases with other regions (i.e., showed weaker connectivity during trials where the object was later remembered vs. forgotten). These regions included a large cluster in right angular gyrus (AnG; Fig. 6A and Table 6). AnG is regarded as a key parietal component of the default mode network, which is thought to be involved in internally oriented or self-related cognition (for reviews, see Buckner et al., 2008; Bressler and Menon, 2010). Thus, one interpretation of this greater mIPS/SPL-AnG connectivity during trials where objects are later forgotten vs. remembered is that directing attention internally prior to the external presentation of a stimulus may be detrimental to the encoding of that information.
We next investigated whether the bilateral TPJ regions exhibiting bottom-up attention effects (during the stimulus-processing period) showed connectivity that differed according to later memory. We therefore looked for subsequent memory connectivity effects using the right and left TPJ regions as seeds. We found no regions that showed positive subsequent memory connectivity effects with the right TPJ seed and only a few that did so with the left TPJ seed (Table 6). By contrast, the TPJ seeds showed robust negative subsequent memory connectivity effects with a broad set of regions (Table 6). Strikingly, the LO/fusiform region that showed positive connectivity effects with the mIPS/SPL seed showed the opposite pattern of connectivity with TPJ (Fig. 6B). The TPJ seeds also showed negative subsequent memory connectivity effects with bilateral regions encompassing parahippocampal and fusiform cortices (Fig. 6B); parahippocampal cortex has been associated with the spatial encoding of objects (reviewed in Eichenbaum & Lipton, 2008). Thus, successful object encoding appears to be associated with reduced coupling between TPJ regions (implicated in reflexive orienting) and LO/fusiform regions (implicated in visual object representation), as well as with parahippocampal regions (implicated in the encoding of objects in space).
Eye movements were not monitored in the scanner, raising the possibility that the observed subsequent memory effects could, in theory, reflect differential BOLD responses associated with eye movement patterns (e.g., objects fixated might be associated with more effective encoding and thus a higher likelihood of being subsequently remembered). While the absence of eye-tracking data precludes a definitive assessment of this possibility, we believe it unlikely for a number of reasons. First, it is unlikely that extensive eye movements occurred during scanning, because participants were able to maintain accurate fixation during the behavioral training session, and an extensive literature suggests that participants tend not to saccade when trained to covertly orient (Posner & Cohen, 1980). Second, if eye movements did occur, to the extent that differences in fixating the to-be-remembered object account for differences in subsequent memory outcome, one might predict that RTs during the object discrimination task would be faster for objects in fixation relative to those peripherally viewed. However, RTs were comparable during the encoding of later remembered and later forgotten stimuli (see Behavioral performance and Table 1). Third, subsequent memory effects have been consistently observed in IPS across studies in which all stimuli fall at the center of the visual field (for review, see Uncapher & Wagner, 2009). Under such conditions, subjects learn that the center of the visual field is task/goal-relevant and it is to their advantage to fixate this region to effectively perceive and make decisions about presented stimuli. Fourth, even under conditions where stimuli are presented rapidly in the center of the visual field, making eye movements suboptimal and therefore unlikely, subsequent memory effects are observed in IPS. 
For instance, Otten and Rugg (2001a) identified subsequent memory effects in IPS/SPL for items presented for 300 ms, with no RT differences between items that were later remembered vs. forgotten.
Finally, we note that studies that have systematically mapped the effects of eye movements or overt shifts of attention have consistently identified bilateral parietal activity, corresponding to saccades to contralateral visual hemifields (reviewed in Silver & Kastner, 2009). Our critical analysis (overlap of top-down attention effects and subsequent memory effects) identified unilateral activity in left IPS (Fig. 4). This lateralization of effects was not due to thresholding, as the effects remained left-lateralized when the threshold of the attention effects was relaxed from p < .001 to a liberal value of p < .01. Given that the items were presented with equal probability to the two sides of the screen, it seems unlikely that fluctuations in the magnitude of the unilateral IPS effects reflect eye movements to both sides of the screen. We would note that this pattern of left-lateralized IPS top-down attention effects has also been found in prior Posner studies, where eye movements were monitored during scanning and found to be negligible (e.g., Doricchi et al., 2010).
The univariate data demonstrated a relationship between neural correlates of attention and episodic encoding of objects: top-down attention effects and encoding success effects overlapped in dPPC (mIPS/SPL), and bottom-up attention effects and encoding failure effects overlapped in vPPC (TPJ). The deployment of top-down attention interacted with stimulus location to either promote memory (if the stimulus appeared in the cued location) or hinder memory (if it appeared in the non-cued location). The multivariate connectivity findings suggest that episodic encoding of objects is influenced by parietal interactions with regions representing visual object information (LO/fusiform), with a positive influence from top-down attention-related regions (mIPS/SPL) and a negative influence from bottom-up attention-related regions (TPJ).
Based on prior observations of encoding success effects localized predominantly to dPPC and encoding failure effects localized exclusively to vPPC, we previously speculated that top-down and bottom-up attention may have distinct influences on event encoding (Uncapher & Wagner, 2009). The present study directly tested this ‘dual-attention encoding hypothesis’ by manipulating top-down and bottom-up attention during an incidental encoding paradigm. Three main findings advance understanding of how attention regulates memory formation. First, for objects appearing in expected locations, dPPC regions engaged during the controlled allocation of visuospatial attention were positively correlated with episodic memory formation, whereas vPPC regions engaged during stimulus-driven attentional capture were negatively correlated with memory formation. Second, we provide novel evidence that the deployment of top-down attention is not always beneficial for encoding, as the dorsal attention network was also associated with encoding failure when objects appeared outside the focus of attention. Finally, connectivity analyses revealed that, during the formation of memories for objects, top-down and bottom-up attention appear to have opposite influences on perceptual cortical areas that subserve visual object representation, suggesting that one manner in which attention modulates memory is by altering the perceptual processing of to-be-encoded stimuli.
We propose that the observed overlap between neural correlates of top-down attention and encoding success in dPPC is indicative of the extent to which the deployment of top-down attention supports memory encoding. In other words, the overlap of effects may reflect engagement of top-down attention mechanisms during the study task, which in turn increases the probability that attended information will progress through the cortical hierarchy to converge on MTL mechanisms for encoding into memory.
Top-down attention can enhance the firing rate of neurons representing goal-relevant elements of an experience (e.g., Desimone & Duncan, 1995; McAdams & Maunsell, 1999; Treue & Martinez-Trujillo, 1999; reviewed in Treue, 2001; Boynton, 2005), and attention is thought to influence between-region communication by altering oscillatory coupling (Saalmann et al., 2007; Gregoriou et al., 2009a), preparing regions to receive input at the most excitable phase of their oscillatory activity (Gregoriou et al., 2009b). The hippocampus, by virtue of its apical position in the neural processing hierarchy (Felleman & Van Essen, 1991), is proposed to be the recipient of cortically processed event features (reviewed in Eichenbaum et al., 2007), and cortical–hippocampal coupling is associated with successful episodic encoding (e.g., Fell et al., 2001). By facilitating the cortical representations of goal-relevant stimuli, top-down attention—partially subserved by an FEF-mIPS/SPL network—could foster the propagation of higher fidelity representations to the hippocampus, resulting in a higher probability of stimulus encoding into memory (‘biased input hypothesis’; Uncapher & Rugg, 2009).
The present data lend support for this perspective. First, activation in mIPS/SPL exhibited a top-down attention effect during the preparatory period, and predicted later memory success during the object processing period. Second, activity in this mIPS/SPL region predicted response profiles in object-sensitive regions of ventral temporo-occipital cortex, namely LO and fusiform. Importantly, this functional coupling was stronger during preparatory periods of trials for which the subsequently presented object was later remembered vs. later forgotten, and the strength of this mIPS/SPL–LO/fusiform coupling difference correlated with across-subject differences in later memory performance. Finally, activity in bilateral fusiform, LO, and hippocampus was greater during the viewing of objects that would be later remembered vs. forgotten. Collectively, these findings are consistent with a role of top-down attention in promoting encoding by (a) preparing object representation processes in LO/fusiform for incoming information, with a stronger drive promoting better encoding, and (b) fostering effective propagation of these putatively higher-fidelity visual representations to the hippocampus. Future studies using intracranial electrocorticography (e.g., Jacobs & Kahana, 2010), with independent electrodes in LO/fusiform and in hippocampus, will provide a means to test the role of attention-mediated oscillatory coupling in visual object encoding into episodic memory.
The dual-attention encoding hypothesis posits that memory failure can result when attention is captured by event information that is not the target of a subsequent retrieval attempt. That is, if attention is shifted away from information that will be the target of subsequent retrieval, encoding of this information will be hindered. Here, the overlap in bilateral TPJ between bottom-up attention effects and negative subsequent memory effects for validly cued objects supports the proposal that attention capture can hinder event encoding. The factors that engage the ventral attention network are debated, but they include at least an unexpectedness or salience dimension (e.g., Posner & Cohen, 1984; Jonides & Yantis, 1988). One possibility is that attention was captured by non-experimental variables (e.g., unexpected change in scanner noise, urge to move, or internally oriented cognition) or stimulus-related variables (e.g., oddly shaped or shaded object), consistent with data indicating that TPJ is engaged by not only spatial but also nonspatial (feature-based) information (e.g., Linden et al., 1999; Marois et al., 2000; Braver et al., 2001; Kiehl et al., 2001; Serences et al., 2005). Thus, while future studies are needed to gain leverage on the nature of the attention-capturing information leading to subsequent forgetting of target objects, the present data provide evidence that a reorienting mechanism in TPJ is directly associated with later forgetting.
Intuitively, it is easy to understand that attentional capture of extra-stimulus information would draw processing resources from the target stimulus, leading to poorer memory for the stimulus. What is less clear is what happens when the attention-capturing information is stimulus-related. Recent studies reveal an equivocal pattern: while items that are distinctive in some way (and therefore attention-capturing) often enjoy a mnemonic advantage (von Restorff, 1933), this is not always the case. Strange and colleagues (2000) reported an interaction of the ‘von Restorff effect’ and depth of processing. Perceptually distinctive items were better remembered only if studied under ‘shallow’ encoding conditions (where superficial aspects of the items were emphasized) (see also Fabiani et al., 1990), whereas emotionally distinctive items were always remembered better. Importantly, semantically distinctive items were remembered worse when attention was paid to that dimension during study. Summerfield and Mangels (2006) additionally showed that items appearing at unpredictable times (distinctive in the temporal dimension) were more poorly remembered than predictable items.
Here we examined the consequences for memory when items appeared in an unexpected location. Like items that are distinctive in the semantic or temporal dimension, items that appeared in an unexpected location suffered a mnemonic disadvantage. As in previous studies, this may be due to an interaction between our encoding task and the attention-capturing dimension. If attention to the unexpected spatial location occurred at the expense of some other memory-promoting dimension, such as semantic elaboration, memory for the objects would suffer. Another explanatory factor may be the degree to which attention-capturing information is used as a retrieval cue. To the degree that spatial information is a poor retrieval cue, capture of attention by the spatial dimension at study may hinder later memory performance.
The present findings also advance understanding of how memories are formed for objects appearing outside the current focus of attention (i.e., invalidly cued objects). Our data revealed that, rather than recruiting a new set of mechanisms, the mechanisms associated with encoding objects outside attentional focus—in fusiform cortex—are a subset of those engaged when objects are within attentional focus. Strikingly, presenting objects outside attentional focus also revealed a novel negative subsequent memory effect in the dorsal attention network (SPL). To date, activation in this top-down network has been exclusively associated with subsequent memory success, rather than failure (Uncapher & Wagner, 2009). This negative effect reversed to a positive effect when objects appeared in the expected location, suggesting that top-down biasing of attention interacts with expectation to either promote or hinder memory formation. Thus, previous findings that the top-down attention network exclusively promotes memory encoding may be due to a lack of expectation violations in prior studies. Collectively, the present findings highlight the role of perceptual cortices in visual object encoding, and the modulatory influence that attention has on activity in these regions.
The present data provide strong evidence that top-down and bottom-up attention mechanisms in parietal cortex influence episodic encoding, with mIPS/SPL-mediated top-down attention generally serving to promote memory formation, and TPJ-mediated bottom-up attention serving to hinder memory formation (at least within the present experimental context). Here we offer a mechanistic explanation that reconciles effects of attention that may appear inconsistent across the literature (Uncapher & Wagner, 2009); namely, that the presence of positive or negative subsequent memory effects in dorsal or ventral PPC can be predicted based on whether one exerts experimental control over the focus of attention. These findings of direct overlap between attention and episodic encoding appear to stand in contrast to the pattern observed during episodic retrieval, where the overlap between parietal attention and memory effects is a topic of current debate (Hutchinson et al., 2009; Uncapher et al., 2010; cf. Cabeza et al., 2008; Ciaramelli et al., 2008). As such, the present data highlight the importance of attention during event processing for later remembering (e.g., Craik et al., 1996).
Supported by grants from the National Institute of Mental Health (5R01–MH080309, F32–MH084475).