Recognition memory is thought to consist of two component processes – recollection and familiarity. It has been suggested that the hippocampus supports recollection, while adjacent cortex supports familiarity. However, the qualitative experiences of recollection and familiarity are typically confounded with a quantitative difference in memory strength (recollection > familiarity). Thus, the question remains whether the hippocampus might in fact support familiarity-based memories whenever they are as strong as recollection-based memories. We addressed this problem in a novel way using the Remember/Know procedure where we could explicitly match the confidence and accuracy of Remember and Know decisions. As in earlier studies, recollected items had higher accuracy and confidence than familiar items, and hippocampal activity was higher for recollected items than for familiar items. Furthermore, hippocampal activity was similar for familiar items, misses, and correct rejections. When the accuracy and confidence of recollected and familiar items were matched, the findings were dramatically different. Hippocampal activity was now similar for recollected and familiar items. Importantly, hippocampal activity was also greater for familiar items than for misses or correct rejections (as well as for recollected items vs. misses or correct rejections). Our findings suggest that the hippocampus supports both recollection and familiarity when memories are strong.
The acquisition of declarative memory depends on the integrity of the medial temporal lobe (the hippocampus, the dentate gyrus and subicular complex, together with entorhinal, perirhinal, and parahippocampal cortices). One of the most widely studied examples of declarative memory is recognition memory--the ability to judge an item as having been encountered previously. There is broad agreement that recognition memory consists of two distinct components, recollection and familiarity (Atkinson & Juola, 1974; Mandler, 1980). Recollection involves remembering specific details about the learning episode. Familiarity refers to remembering that an item was encountered previously, but without the ability to identify any information about the learning episode.
There has been considerable interest in the neuroanatomy of recollection and familiarity and particularly in the possibility that structures within the medial temporal lobe might differentially and uniquely support these functions. For example, it has been proposed that recollection depends on the hippocampus and familiarity on the adjacent perirhinal cortex (for reviews see Brown & Aggleton, 2001; Diana et al., 2007; Eichenbaum et al., 2007; Skinner & Fernandes, 2007). In lesion studies as well as in neuroimaging studies, a number of methods have been used to separate recollection and familiarity, including high or low confidence ratings, the presence or absence of source recollection, and Remember or Know judgments (Squire et al., 2007).
In fMRI studies, a common finding has been that hippocampal activity is higher for recollection-based decisions than for familiarity-based decisions (Cansino et al., 2002; Daselaar et al., 2006; Eldridge et al., 2000; Montaldi et al., 2006; Otten, 2007; Yonelinas et al., 2005). In addition, familiarity-based decisions often do not appear to engage the hippocampus. Thus, hippocampal activity is often no different when items are recognized based on familiarity than when items are not recognized as having appeared on an earlier list (Davachi et al., 2003; Eldridge et al., 2000; Montaldi et al., 2006; Vilberg & Rugg, 2007).
The interpretation of this rather consistent picture is complicated by the fact that the methods used to differentiate recollection from familiarity also invariably differentiate strong memories from weak memories. That is, recollection-based decisions are typically associated with higher confidence and/or higher accuracy than familiarity-based decisions (even though this need not be the case). For example, old/new judgments made with high confidence, old judgments made in association with correct source judgments, or Remember judgments are all made with higher confidence and/or higher accuracy than judgments made with low confidence, with incorrect source judgments, or Know judgments (high vs. low confidence, Mickes et al., 2007; Reed et al., 1997; correct vs. incorrect source, Gold et al., 2006; Slotnick & Dodson, 2005; Remember vs. Know, Dunn, 2004, 2008; Rotello & Zeng, 2008; Wixted & Stretch, 2004).
In a recent attempt to address this strength confound (Cohn et al., 2009), participants were instructed to make a high-confidence Remember judgment when details about a previously presented item were recollected and to make a high-confidence Familiar judgment (equivalent to a high-confidence Know judgment) when no details about the item could be recollected. However, the behavioral accuracy scores computed from hit and false alarm rates indicate that Remember judgments and high-confidence Familiar judgments differed considerably in memory strength despite the intention to avoid this problem (Wixted et al., 2010).
We have addressed this problem in a novel way using the Remember/Know procedure. Participants were first asked to make an old/new judgment for each studied item according to a 1-20 confidence scale and were then asked to judge each item according to whether it was Remembered, Known, or was a Guess (Rotello & Zeng, 2008; Wixted & Mickes, 2010). We then assessed brain activity before and after explicitly matching the confidence and accuracy of Remember and Know decisions.
Sixteen right-handed volunteers (7 female; mean age = 27; range = 19 – 36) recruited from the University community gave written informed consent before participation and were compensated monetarily.
The stimuli were 360 nouns with a mean frequency of 27 (range 1-191) and concreteness rating > 500 (mean = 573) obtained from the MRC Psycholinguistics Database (Wilson, 1988). Five 60-word lists were used for study, and one 60-word list provided foils for the retrieval test. The ratio of targets to foils (300:60) maximized the number of trials available for the primary analyses (Remember hits, Know hits, and Misses). An equal number of targets and foils would have made the scan time excessively long (> 2 hr). The assignment of study lists to the study and retrieval test conditions was randomized across participants. All words were presented in black font on a white background.
Before scanning, participants saw 300 words and were told that their memory would be tested. They made a pleasant/unpleasant judgment for each word (2.5-s presentation time, 500-ms intertrial interval) by pressing one of two marked buttons on a laptop computer keyboard (Figure 1). The study session was divided into four equal blocks of 75 trials with short breaks between blocks.
Following the study session (about 20 min), participants took a memory test in the MRI scanner for 300 target words and 60 foil words. Participants were scanned in 6 separate runs (~2-min delay between runs), such that each run contained 50 target words and 10 foils. For each word, participants made an old/new recognition judgment (4 s/word) using a 20-point scale (1 = definitely new, 20 = definitely old) (Figure 1). Participants were instructed to use the entire 20-point scale. For words declared old, participants judged whether the word was recollected, was familiar, or was a guess (2 s/word), following a modified Remember-Know-Guess procedure (Rajaram, 1996; Wixted & Mickes, 2010). The modified instructions emphasized that participants should use the Remember response only if they could actually describe specific details about the experience of studying the word. They were told that they should use the Know response if they thought the word was familiar but could not recollect any details of their encounter with the word.
Participants made their responses by moving the cursor of an MRI-compatible mouse (Current Designs, Philadelphia, PA) to the appropriate location on the screen (i.e., a number from 1 through 20 and the words Remember, Know, and Guess) (Figure 1). An odd/even digit task (Stark & Squire, 2001) was intermixed with word presentation and served as a baseline against which the hemodynamic response was estimated. For the digit task, participants saw a digit from 1 to 9 (1.75 s duration followed by a 0.25 s interval) and indicated whether the digit was odd or even by moving the mouse cursor. Each scan run began with 5 digit trials and ended with 7 digit trials. After the presentation of each word, 0 to 7 digit trials were given (101 total digit trials per scan run). Words were more likely to be followed by few digit trials (e.g., 0, 1, or 2 trials) than many digit trials (e.g., 5, 6, or 7 trials). The mean intertrial interval between words was 3.4 s (range = 0 – 14 s). Participants were given a short practice block before scanning to ensure that they understood the task and how to use the MRI-compatible mouse.
For all behavioral responses, the vertical position of the mouse cursor was fixed over the response options and the cursor could be moved only in the left-right direction. The starting position of the cursor was randomized across trials. In the event that participants made an erroneous response, they were instructed to indicate the error by pressing the mouse button on the subsequent trial. Three participants made erroneous responses during testing (mean for these participants = 5.3 trials; range 1 – 11). Trials with erroneous responses were discarded.
Imaging was carried out on a 3T GE scanner at the Center for Functional MRI (University of California, San Diego). Functional images were acquired using a gradient-echo, echo-planar, T2*-weighted pulse sequence (2000 ms TR; 30 ms TE; 90° flip angle; 64 × 64 matrix size; 25 cm field of view). The duration of the experiment within each scan run varied according to the number of old and new responses participants made (i.e., trials for words designated as old lasted 6 s, whereas trials for words designated as new lasted 4 s). If participants finished the experiment before the maximum number of MR volumes had been acquired (281 MR volumes; i.e., the number of MR volumes needed if they had indicated every word was old), they viewed a screen instructing them to take a break from the task until the scanner stopped (~ 30 s). Any MR volumes acquired during the break period were excluded from analysis. The first five MR volumes acquired were discarded to allow for T1 equilibration. Thirty-six oblique coronal slices (slice thickness = 4.8 mm) were acquired perpendicular to the long axis of the hippocampus and covering the whole brain. Following the six functional runs, high-resolution structural images were acquired using a T1-weighted IR-SPGR pulse sequence (25.6 cm field of view; 8° flip angle; 2.9 ms TE; 172 slices; 1.0 mm slice thickness; 256 × 256 matrix size).
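The maximum of 281 MR volumes per run follows from the trial timings given above. The short Python sketch below reproduces that arithmetic; the constant and function names are illustrative, and it assumes each digit trial occupies 2 s (1.75 s display plus 0.25 s interval) and that no other time is added to a run.

```python
# Timing constants taken from the text; the names are illustrative.
TR_S = 2.0                    # repetition time (2000 ms)
WORDS_PER_RUN = 60            # 50 targets + 10 foils
OLD_TRIAL_S = 6.0             # 4 s old/new rating + 2 s Remember/Know/Guess
NEW_TRIAL_S = 4.0             # old/new rating only
DIGIT_TRIALS_PER_RUN = 101
DIGIT_TRIAL_S = 1.75 + 0.25   # digit display plus interval (assumed to sum to one trial)


def volumes_needed(n_old_responses: int) -> int:
    """Number of MR volumes spanned by one run, given how many words were called old."""
    n_new_responses = WORDS_PER_RUN - n_old_responses
    run_s = (n_old_responses * OLD_TRIAL_S
             + n_new_responses * NEW_TRIAL_S
             + DIGIT_TRIALS_PER_RUN * DIGIT_TRIAL_S)
    return round(run_s / TR_S)


max_volumes = volumes_needed(WORDS_PER_RUN)  # every word called old -> 281
min_volumes = volumes_needed(0)              # every word called new -> 221
```

Under these assumptions, each word designated as new rather than old shortens the run by one volume, which is why the length of the break at the end of a run varied across participants.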
fMRI data were analyzed using the AFNI suite of programs (Cox, 1996). Functional data were corrected for field inhomogeneities with field mapping data collected before functional scanning, coregistered in three dimensions with the whole-brain anatomical data, slice-time corrected, and coregistered through time to reduce effects of head motion. Large motion events, defined as MR volumes in which there was > 0.3° of rotation or > 0.6 mm of translation in any direction, were excluded from the deconvolution analysis by censoring the excluded time points but without affecting the temporal structure of the data. We also excluded the MR volumes immediately preceding and following the motion-contaminated MR volumes.
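The censoring rule can be illustrated with a minimal Python sketch. It is not the AFNI implementation: the function name is hypothetical, and the inputs are assumed to be, for each MR volume, the largest rotation and the largest translation across the three axes. Whether motion is measured relative to the preceding volume or to a reference volume is not specified here, so that choice is left to whoever supplies the inputs.

```python
from typing import List, Sequence


def censor_volumes(rotation_deg: Sequence[float],
                   translation_mm: Sequence[float],
                   max_rotation_deg: float = 0.3,
                   max_translation_mm: float = 0.6) -> List[bool]:
    """Return one flag per volume: True = keep, False = censor.

    A volume is censored if its motion exceeds either threshold, and so are
    the volumes immediately before and after it.
    """
    n = len(rotation_deg)
    if len(translation_mm) != n:
        raise ValueError("rotation and translation series must have equal length")
    keep = [True] * n
    for i in range(n):
        too_much_rotation = abs(rotation_deg[i]) > max_rotation_deg
        too_much_translation = abs(translation_mm[i]) > max_translation_mm
        if too_much_rotation or too_much_translation:
            for j in (i - 1, i, i + 1):
                if 0 <= j < n:
                    keep[j] = False
    return keep


# Hypothetical motion estimates for six volumes.
rot = [0.0, 0.1, 0.4, 0.1, 0.0, 0.0]
trans = [0.1, 0.1, 0.1, 0.1, 0.1, 0.7]
keep = censor_volumes(rot, trans)
```

Returning a keep/censor flag per volume, rather than deleting volumes, mirrors the point made above that censoring leaves the temporal structure of the data intact.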
Behavioral vectors were created that coded each retrieval trial according to the old/new status of the word and the old/new judgment to create four categories: hits [correct “old” responses (11-20) to a target], misses [incorrect “new” responses (1-10) to a target], correct rejections [correct “new” responses (1-10) to a foil], and false alarms [incorrect “old” responses (11-20) to a foil]. Two separate models were created. For the first model (N=16), the vectors for the hits were divided further into Remember hits, Know hits, and Guess hits, collapsing across memory confidence (mean number of trials for Remember hits = 148.7 ± 10.9; mean number of trials for Know hits = 89.2 ± 9.4; mean number of trials for Guess hits = 29.7 ± 6.3). A second model was created to equate accuracy and confidence for Remember hits and Know hits (N=16). For each participant, trials from one or more levels of confidence were combined so that the average accuracy was exactly the same or similar for Remember hits and for Know hits. Trials were combined from the highest confidence level(s) possible until there were sufficient trials for estimating the hemodynamic response function (mean number of trials for Remember hits = 119.9, range = 25 – 231; mean number of trials for Know hits = 35.7; range = 14 – 107). From the full set of hits, 80.5% of the Remember hits and 39.6% of the Know hits were used to create Strong Remember and Strong Know conditions. The remaining hit trials with lower confidence were modeled for Remember hits and for Know hits to create Weak Remember and Weak Know conditions. As is typically the case, there were very few trials in the Weak Remember condition (29.1 trials on average, and 8 of the 16 participants had fewer than 10 trials). In contrast, there were ample weak Know hits for fMRI analysis (mean number of trials for Weak Know hits = 53.9; range = 11 – 127).
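One way to picture the construction of the Strong conditions is the following Python sketch. It is a simplified illustration under stated assumptions, not the procedure actually used: the function names, the example counts, and the `min_trials` value are hypothetical (no numerical criterion for "sufficient trials" is given above), and the sketch covers only the pooling of confidence levels from the top of the scale downward, not the additional step of checking that accuracy is similar for Remember and Know.

```python
from typing import Dict, List


def pool_strong_levels(hits_by_confidence: Dict[int, int], min_trials: int) -> List[int]:
    """Pool confidence levels from the highest downward until min_trials hits are reached."""
    chosen: List[int] = []
    total = 0
    for level in sorted(hits_by_confidence, reverse=True):
        if total >= min_trials:
            break
        chosen.append(level)
        total += hits_by_confidence[level]
    return chosen


def percent_correct(hits: int, false_alarms: int,
                    n_targets: int = 300, n_foils: int = 60) -> float:
    """Accuracy as 100 * hit rate / (hit rate + false alarm rate)."""
    hit_rate = hits / n_targets
    false_alarm_rate = false_alarms / n_foils
    if hit_rate + false_alarm_rate == 0:
        raise ValueError("no old responses in the pooled levels")
    return 100.0 * hit_rate / (hit_rate + false_alarm_rate)


# Hypothetical Know responses for one participant, keyed by confidence rating.
know_hits = {20: 10, 19: 8, 18: 12, 17: 15}
know_false_alarms = {20: 0, 19: 0, 18: 1, 17: 2}

strong_levels = pool_strong_levels(know_hits, min_trials=15)
strong_hits = sum(know_hits[level] for level in strong_levels)
strong_false_alarms = sum(know_false_alarms[level] for level in strong_levels)
strong_accuracy = percent_correct(strong_hits, strong_false_alarms)
```

In this made-up example, ratings of 20 and 19 are pooled to form the Strong Know condition, and the remaining Know hits would form the Weak Know condition.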
The behavioral vectors, six vectors that coded for motion (three for translation and three for rotation), and three polynomial vectors that coded for linear, quadratic, and cubic drift in the MRI signal were used in deconvolution analyses of the fMRI time series data. The deconvolution method does not assume a shape of the hemodynamic response, and the fit of the data to the model was estimated for each time point independently (0 - 14 s after trial onset). The resultant fit coefficients (β coefficients) represent activity versus baseline in each voxel for a given time point and for each of the response categories. For comparisons that involved only words designated as old (e.g., Remember hits vs. Know hits), this activity was summed over the expected hemodynamic response (2-14 s) and taken as the estimate of the response (relative to the digit task baseline).
The trial length was shorter for words designated as new (correct rejections and misses) than for words designated as old (hits and false alarms) (4 s vs. 6 s, respectively), because the Remember/Know/Guess judgment was omitted for words designated as new. Accordingly, for comparisons that involved new judgments (e.g., Remember or Know hits vs. Misses), the β coefficients for both response categories were summed over the first 2-8 s of the modeled hemodynamic response. Note that when all analyses were limited to 2-8 s of the modeled hemodynamic response (not just analyses involving new judgments), our main findings (i.e., Figures 4 and 6) remained the same.
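The two summation windows can be made concrete with a small Python sketch. The β values are invented for illustration, the function name is hypothetical, and the sketch assumes one β estimate per TR (2 s), so that the 0-14 s response is described by eight time points.

```python
from typing import Sequence

TR_S = 2.0  # assumed spacing of the deconvolution estimates: 0, 2, ..., 14 s


def summed_response(betas: Sequence[float], start_s: float, end_s: float) -> float:
    """Sum the β estimates whose lag after trial onset lies within [start_s, end_s]."""
    total = 0.0
    for index, beta in enumerate(betas):
        lag_s = index * TR_S
        if start_s <= lag_s <= end_s:
            total += beta
    return total


# Hypothetical β estimates for one voxel and one response category (lags 0-14 s).
betas = [0.0, 0.1, 0.4, 0.6, 0.3, 0.2, 0.1, 0.0]

old_only = summed_response(betas, 2, 14)  # e.g., Remember hits vs. Know hits
with_new = summed_response(betas, 2, 8)   # e.g., Know hits vs. Misses
```

The shorter window simply drops the later time points, so that categories with 4-s and 6-s trials are compared over the same portion of the response.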
Initial spatial normalization was accomplished using each participant’s structural MRI scan to transform the data to the atlas of Talairach and Tournoux (1988). Statistical maps were also transformed to Talairach space, resampled to 2 mm³, and smoothed using a Gaussian filter (4 mm FWHM) that respected the anatomical boundaries of the several MTL regions defined for each individual participant (see below). Specifically, the smoothing was carried out within each of the anatomically defined MTL regions, but smoothing was not extended beyond the edges of these regions in order to prevent activity from one region (e.g., parahippocampal cortex) from being blurred into another, adjacent region (e.g., hippocampus). This was accomplished by creating a separate mask for each region, smoothing the data within that mask, and then recombining the smoothed data. The Talairach-transformed data were used in the whole-brain analyses. Anatomical regions were manually segmented in 3D on the Talairach-transformed anatomical images for the hippocampus, temporal polar, entorhinal, perirhinal, and parahippocampal cortices on each side. Temporal polar, entorhinal, and perirhinal cortices were defined according to the landmarks described by Insausti et al. (1998a). The caudal border of the perirhinal cortex was defined as 4 mm caudal to the posterior limit of the gyrus intralimbicus as identified on coronal sections (Insausti et al., 1998a). The parahippocampal cortex was defined bilaterally as the portion of the parahippocampal gyrus caudal to the perirhinal cortex and rostral to the splenium of the corpus callosum (Insausti et al., 1998b).
We used a recent instantiation of an ROI alignment technique (ROI-ANTS; Yassa et al., 2010; Lacy et al., 2011) to optimally align regions of the medial temporal lobe across participants (Yassa & Stark, 2009). This method uses Advanced Normalization Tools (ANTs), which implements SyN (symmetric normalization), a powerful diffeomorphic registration algorithm (Klein et al., 2009). A customized anatomical space was constructed based on the Talairach-transformed structural scans from the 16 participants in the study. Each participant’s grayscale scan and hand-drawn ROI segmentations of the hippocampus were used simultaneously to warp the structural scan into the customized anatomical space (Yushkevich et al., 2009).
Parameter estimate maps for each participant were entered into group-level analyses and in all cases thresholded at a voxel-wise p-value of p < 0.01. For the MTL analyses, group statistic maps were masked to include only regions of the MTL. A cluster correction technique was used to correct for multiple comparisons in all group-level analyses, and Monte Carlo simulations (AlphaSim software) were used to determine how large a cluster of voxels was needed to be statistically meaningful (p < 0.05) (Forman et al., 1995; Xiong et al., 1995). Within the volume of the MTL, the minimum cluster extent was 17 contiguous voxels, and for the volume of the entire brain the minimum cluster extent was 48 voxels.
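The logic of a cluster-extent threshold can be sketched in Python as follows. This is an illustration only and does not reproduce AlphaSim, which estimates the minimum extent by simulation; here the extent (17 voxels for the MTL) is simply taken as given. The function name and the toy p-value map are hypothetical, and "contiguous" is assumed to mean voxels that share a face, which may differ from the connectivity rule used in the actual analysis.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Voxel = Tuple[int, int, int]
FACE_NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surviving_clusters(p_values: Dict[Voxel, float],
                       voxel_p: float = 0.01,
                       min_extent: int = 17) -> List[Set[Voxel]]:
    """Return clusters of face-connected voxels with p < voxel_p and >= min_extent voxels."""
    candidates = {voxel for voxel, p in p_values.items() if p < voxel_p}
    clusters: List[Set[Voxel]] = []
    while candidates:
        seed = candidates.pop()
        cluster = {seed}
        queue = deque([seed])
        # Breadth-first search over suprathreshold neighbours.
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in FACE_NEIGHBOURS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in candidates:
                    candidates.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        if len(cluster) >= min_extent:
            clusters.append(cluster)
    return clusters


# Toy map: an 18-voxel block, a 2-voxel cluster, and one voxel above the voxel-wise threshold.
p_map = {(x, y, z): 0.001 for x in range(3) for y in range(3) for z in range(2)}
p_map[(10, 10, 10)] = 0.001
p_map[(10, 10, 11)] = 0.001
p_map[(20, 20, 20)] = 0.05
clusters = surviving_clusters(p_map)
```

In the toy map only the 18-voxel block survives; the 2-voxel cluster is too small, and the isolated voxel does not pass the voxel-wise threshold.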
Participants distributed their responses over the entire 1-20 scale (Figure 2). High-confidence responses (ratings of 19 and 20) were primarily associated with Remember judgments, but high-confidence Know judgments were abundant as well. Guess judgments were predominantly associated with lower confidence ratings (ratings of 11 and 12).
Accuracy (percent correct = 100 * hit rate / [hit rate + false alarm rate]) and confidence were higher for words designated as Remember (97.1 ± 1.4% correct and 19.5 ± 0.2 confidence rating) than for words designated as Know (74.1 ± 3.8% correct and 17.4 ± 0.4 confidence rating) (Figure 3; ps < .001). Accuracy and confidence were lower for words designated as Guess (33.5 ± 3.3% correct and 12.9 ± 0.3 confidence rating) than for words designated as either Remember or Know (ps < .001). The hit rates for Remember, Know, and Guess judgments were 0.50, 0.30, and 0.10, respectively, and the false alarm rates were 0.02, 0.12, and 0.17.
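As a worked check of the accuracy measure, the Python sketch below applies the formula to the group hit and false alarm rates just reported. The function name is illustrative. The values obtained from the group rates (about 96.2%, 71.4%, and 37.0%) are close to, but not identical with, the reported means; this would be expected if the reported means were computed for each participant and then averaged, which is an assumption on our part here.

```python
def percent_correct(hit_rate: float, false_alarm_rate: float) -> float:
    """Accuracy as 100 * hit rate / (hit rate + false alarm rate)."""
    return 100.0 * hit_rate / (hit_rate + false_alarm_rate)


# Group hit and false alarm rates from the text.
rates = {
    "Remember": (0.50, 0.02),
    "Know": (0.30, 0.12),
    "Guess": (0.10, 0.17),
}

accuracy = {label: round(percent_correct(h, f), 1) for label, (h, f) in rates.items()}
# accuracy == {"Remember": 96.2, "Know": 71.4, "Guess": 37.0}
```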
The first analysis of brain activity followed procedures that have been used previously with similar data (Eldridge et al., 2000; Montaldi et al., 2006; Yonelinas et al., 2005). First, we looked for clusters in the medial temporal lobe where activity for Remember hits was higher than for Know hits. Three clusters were identified, including left hippocampus/parahippocampal cortex, right hippocampus, and left temporopolar cortex (Figure 4A, 4B; Table 1). Next, in separate analyses, we directly compared Remember hits and Know hits to Misses and Correct rejections (the two latter representing responses where participants reported no experience of a memory). Activity for Remember hits was higher in left hippocampus than activity for Misses (Figure 4C) or Correct rejections (Table 1). In contrast, there were no clusters detected in the medial temporal lobe when comparing activity for Know hits to either Misses (Figure 4D) or Correct rejections (Table 1). The findings illustrated in Figure 4 replicate what has been reported previously in fMRI studies of Remembering and Knowing (Eldridge et al., 2000), as well as in other similar studies (Montaldi et al., 2006; Yonelinas et al., 2005).
One might conclude from our findings that activity in hippocampus is related to recollection but not to familiarity. Note, however, that words designated as Remember and words designated as Know differed not only with respect to the reported presence or absence of recollection but also with respect to the strength of the memory (Figure 3). Specifically, words designated as Remember were recognized with both high accuracy and high confidence (accuracy > 95% correct, confidence rating > 19), whereas words designated as Know were recognized with lower accuracy and lower confidence (accuracy < 75% correct, confidence rating < 18). In order to compare Remember and Know responses without this difference in memory strength, we matched the accuracy and confidence ratings associated with Remember hits and Know hits (Figure 5). These we termed Strong Remember responses (97.7 ± 1.4% correct and 19.5 ± 0.2 confidence rating) and Strong Know responses (95.9 ± 2.1% correct and 19.2 ± 0.3 confidence rating).
After memory strength was matched for Remember hits and for Know hits, we repeated the original data analyses. These analyses yielded strikingly different results than what was found before memory strength was matched. First, we found no clusters in the medial temporal lobe where activity was higher for Remember hits than for Know hits (i.e., Strong Remember vs. Strong Know; Figure 6A; Table 1). Second, clusters were now identified where activity was higher for Know hits than for Misses or Correct rejections. Specifically, a comparison of Strong Know hits and Misses identified a cluster in left hippocampus/entorhinal/perirhinal/parahippocampal cortex (Figure 6C). In addition, a comparison of Strong Know hits and Correct rejections identified clusters in bilateral hippocampus and left parahippocampal cortex (Table 1). Lastly, we compared Strong Remember hits to Misses and Correct rejections. For these comparisons (Figure 6B and Table 1), we identified the same clusters as in the original analyses (Figure 4C, Table 1).
It is worth pointing out that perirhinal cortex activity was unique to the Strong Know vs. Miss contrast. Specifically, the Strong Know vs. Miss contrast yielded significantly more activity in left perirhinal cortex than the Strong Remember vs. Miss contrast (440 μl; −31, −21, −24). This finding is interesting in light of the putative role of perirhinal cortex in familiarity-based responses (Brown & Aggleton, 2001; Diana et al., 2007; Eichenbaum et al., 2007; Skinner & Fernandes, 2007).
Our findings suggest that activity in the medial temporal lobe identified earlier in the contrast between Remember hits and Know hits (Figure 4A, 4B) is related to differences in memory strength and not to the presence of recollection. If so, one might expect to obtain similar findings in the medial temporal lobe in other comparisons involving conditions that differ in memory strength. Accordingly, we compared Strong Know hits to Weak Know hits. These two response categories differed considerably in memory strength (Strong Know = 95.9 ± 2.1% correct and 19.2 ± 0.3 confidence rating; Weak Know = 69.7 ± 4.7% correct and 16.2 ± 0.3 confidence rating). As expected, there was considerable overlap between the regions identified in this new contrast (Strong Know hits vs. Weak Know hits, Table 1) and the regions identified in the original Remember hits vs. Know hits contrast (Figure 4A, 4B). Specifically, brain activity was higher for Strong Know judgments than for Weak Know judgments in the hippocampus bilaterally, in parahippocampal cortex bilaterally, and in left perirhinal cortex. A similar analysis involving Remember judgments could not be carried out because there was an insufficient number of Remember judgments made with low confidence (see Experimental Procedures: fMRI Data Analysis).
Note that when we substantially relaxed the voxel-wise threshold for the comparison of Strong Remember hits vs. Strong Know hits, we were able to detect a cluster in left hippocampus that exhibited higher activity for Strong Remember judgments than for Strong Know judgments (voxelwise p value = 0.15 uncorrected). However, at the same threshold we also detected a cluster in right hippocampus and bilateral clusters in parahippocampal gyrus that exhibited the opposite pattern: higher activity for Strong Know judgments than for Strong Remember judgments. These opposite patterns of results are what might be expected if the threshold is too low (i.e., the statistical test would be expected to yield false positives).
In summary, when memory strength was high for items designated as Remember and lower for items designated as Know, several regions in the medial temporal lobe exhibited higher activity for Remember responses than for Know responses (Figure 4A, 4B). However, when memory strength was high for items designated as Know (and equivalent in memory strength to the items designated as Remember), no regions in the medial temporal lobe were identified in this same contrast (Figure 6A). Furthermore, when memory strength was high for items designated as Remember and also high for items designated as Know, brain activity in the medial temporal lobe was higher for both Remember and Know responses than for Misses or Correct rejections (Figures 6B, 6C).
Although no regions in the medial temporal lobe distinguished items designated as Remember from items designated as Know after memory strength was matched, a number of neocortical regions were identified, i.e., higher activity for Strong Remember hits than for Strong Know hits (Figure 7; Table 2). These regions were left anterior cingulate and superior frontal gyrus, right medial frontal and rectal gyri, left posterior cingulate/precuneus, left angular gyrus, and bilateral cuneus. Other neocortical regions, as well as right thalamus and bilateral cerebellum, exhibited the opposite pattern, i.e., higher activity for Strong Know hits than for Strong Remember hits (see Figure 7 and Table 2).
We assessed brain activity when recognition memory judgments were associated with recollection and when these judgments were associated with familiarity. Participants first judged each studied item according to the confidence of their old/new decision (1 = definitely new, 20 = definitely old). For words judged old (ratings of 11-20), participants then decided whether the item was Remembered, Known, or was a Guess. The behavioral results yielded the expected memory-strength difference between Remember judgments (average accuracy = 97% correct, average confidence = 19.5), thought to denote recollection, and Know judgments (average accuracy = 74% correct, average confidence = 17.4), thought to denote familiarity. When correct Remember judgments were compared to correct Know judgments, brain activity was detected in several regions of the medial temporal lobe, including hippocampus (Figure 4A, 4B, Table 1). Moreover, hippocampal activity was higher for correct Remember judgments than for misses (Figure 4C) or correct rejections (Table 1), but no activity in the hippocampus or elsewhere in the medial temporal lobe was detected in similar comparisons between correct Know judgments and misses (Figure 4D), or correct rejections (Table 1). These findings replicate what has been reported in other, similar studies, and they appear to support the view that activity in the hippocampus is related to recollection but not to familiarity (Eldridge et al., 2000; Montaldi et al., 2006; Yonelinas et al., 2005).
Yet, words designated as Remember and words designated as Know differed not only with respect to the reported presence or absence of recollection but they also differed with respect to the strength of the memory itself. When the strength difference was removed by equating both the accuracy scores and the confidence ratings for items given Remember and Know judgments, the results were dramatically different. First, regions in the medial temporal lobe no longer distinguished Remember judgments from Know judgments (though regions of neocortex, especially in the frontal lobe, did make this distinction). Second, hippocampal activity was higher for both Remember judgments and Know judgments than for misses or correct rejections. These findings suggest that the hippocampus is associated with elevated activity for both recollection-based and familiarity-based decisions, contrary to what has often been concluded in the past.
It is important to emphasize that our findings do not imply (and we do not claim) that hippocampal activity associated with Strong Know responses must be equivalent to hippocampal activity associated with Strong Remember responses (or that the activity associated with any particular strength of Know responses should be equivalent to the activity associated with a similar strength of Remember responses). What our findings do imply (and what we do claim) is that a familiarity signal (Strong Know > Weak Know; Strong Know > Miss; Strong Know > Correct Rejections) is evident in the hippocampus when Know judgments are as strong as Remember judgments.
Our findings further suggest that familiarity, like recollection, must be sufficiently strong before elevated activity will be observed in the hippocampus (Song et al., 2011). Strong familiarity is associated with those Know judgments that are made with both high confidence and high accuracy. One possible concern is that high-confidence Know judgments actually reflect recollection and are not based strictly on familiarity. In fact, considerable evidence shows that, although Know judgments are associated with much less recollection than Remember judgments, they are almost always associated with some amount of recollection (e.g., Eldridge et al., 2005; Wais et al., 2008). However, even high-confidence Know judgments are associated with the same small amount of recollection that is typically associated with weak Know judgments. For example, Wixted and Mickes (2010) used the same method as ours but also tested for the presence of recollection objectively using source memory questions (e.g., questions about the location and/or color of the study word when it was presented on a computer screen). For Know judgments, item recognition accuracy increased from 71% to 75% to 88% correct as item confidence increased from 15-16 to 17-18 to 19-20, respectively (on a 1-20 scale). Importantly, source accuracy for Know judgments was low and did not change as a function of item confidence (source accuracy was 58%, 57%, and 58% correct, respectively, as item confidence increased from 15-16, to 17-18, to 19-20). See Ingram, Mickes and Wixted (in press) for a similar result using a different Remember/Know procedure.
Thus, high-confidence Know responses are like lower-confidence Know responses in that they contain little recollection (though some small amount of recollection can always be detected in conjunction with Know responses). Accordingly, strong Know judgments achieve their high strength because they involve a greater degree of familiarity – not a greater degree of recollection – than weak Know judgments. The point is that high-confidence Know responses are associated with no more recollection than occurs in the Know condition of typical Remember/Know studies where one finds that hippocampal activity is no different for Know judgments, misses and correct rejections.
A number of fMRI studies have used a modified Remember/Know procedure to compare recollection-based decisions with high-confidence, familiarity-based decisions (Cohn et al., 2009; Yonelinas et al., 2005). However, in these studies, as in other Remember/Know studies, the recollection condition involved considerably stronger memory than the familiarity condition. In a similar study, Montaldi et al. (2006) used a novel (familiarity-only) procedure to try to minimize the contribution of recollection. Participants studied complex visual scenes and were instructed (and trained) to scan them superficially. After a 2-day delay, a recognition memory test was administered in the scanner. Participants were asked to rate the items for familiarity and to avoid effortful recollection (but to nevertheless report recollection if it did occur). “Old” decisions were made using a rating scale of F1, F2, F3 (low, medium and high degrees of familiarity, respectively) and R (recollection). Hippocampal activity was selectively elevated for the few R judgments that occurred and not for F3 judgments. Montaldi et al. (2010) recently undertook a new analysis of accuracy scores for F3 and R in that study and reported them to be similar (88% correct and 89% correct, respectively). Yet, it is important to note that the two hippocampal clusters identified in that study were found only when a more lenient threshold was used than was used for all the other data analyses. Furthermore, the clusters were quite small (one contained 2 voxels and the other contained 6 voxels) and no correction for multiple comparisons was used. Additional work is merited with this particular method before concluding that strong, familiarity-based memory does not yield hippocampal activity.
In any case, our findings suggest that under typical experimental conditions – the conditions used in most studies using the Remember/Know procedure – strong familiarity-based memory is associated with elevated activity in the hippocampus.
If the hippocampus did not support familiarity but only supported recollection, then patients with hippocampal lesions would be expected to commonly experience a strong sense of familiarity, i.e., a sense of having encountered an item previously even though nothing specific about a prior encounter can be recalled. This is a familiar experience for most people (for example, recognizing someone’s face with certainty without being able to recollect anything about the person). Still, the experience is rather rare because, ordinarily, a strong sense of familiarity is accompanied by recollection. However, if hippocampal lesions selectively impair recollective ability, then this experience should not be rare for patients with such lesions. In a recent study designed to document the frequency of this phenomenon, i.e., how often hippocampal patients experience strong, familiarity-based recognition in the absence of recollection, the finding was that, if anything, the phenomenon occurs less often in patients than in controls (Kirwan et al., 2010). This finding from patients is consistent with the idea that the hippocampus supports familiarity (as well as recollection), and it accords with the neuroimaging data presented here that strong, familiarity-based recognition memory is associated with elevated activity in the hippocampus.
No medial temporal lobe structures distinguished items designated as Remember from items designated as Know, after memory strength was matched. These findings are consistent with the idea that the processes of recollection and familiarity are both supported by the hippocampus. The findings are also consistent with a single-process view that draws no distinction between recollection and familiarity. However, consistent with a dual-process view, a number of neocortical regions did discriminate between these processes. In particular, dorsomedial and dorsolateral prefrontal cortex exhibited more activity for Strong Remember judgments than for Strong Know judgments (Figure 7, Table 2). These findings are consistent with two neuroimaging studies that also associated recollection with frontal cortical activity after memory strength was accounted for (Kirwan et al., 2008; Wais et al., 2010). In addition, our findings are concordant with a substantial neuropsychological literature linking frontal lobe function to source memory, free recall, and other measures of recollection (Janowsky et al., 1989; Moscovitch & Winocur, 1995; Wheeler et al., 1995).
Our results weigh against the suggestion that the hippocampus does not support familiarity, and they help to explain why so many findings have seemed to suggest otherwise. Specifically, prior studies using the Remember/Know procedure (as well as other procedures) have almost invariably compared strong memories (Remember) to weak memories (Know). It may be the case with fMRI that elevated activity in the hippocampus is unlikely to be detected when memory is weak (Song et al., 2011). The fact that a memory strength confound can explain why earlier Remember/Know studies failed to detect familiarity-based activity in the hippocampus should not be taken to mean that “memory strength” is a concept that usefully informs the functional organization of medial temporal lobe structures. Nor should our findings suggest that any observed functional differences between medial temporal lobe structures will disappear once memory strength is equated. The point instead is that the role of the hippocampus in familiarity-based recognition has been obscured by a methodological confound. The fact that hippocampal activity is associated with both recollection and familiarity once memory strength is equated at a high level suggests that the functional organization of the medial temporal lobe will be best understood in terms unrelated to the distinction between recollection and familiarity (Wixted & Squire, 2011).
This work was supported by the Medical Research Service of the Department of Veterans Affairs, NIMH (MH24600), the Metropolitan Life Foundation, and P50AG005131. We thank Anna van der Horst, Jennifer Frascino, Annette Jeneson, Zhuang Song, Ashley Knutson, and Craig Stark for assistance.
Christine N. Smith, Department of Psychiatry, University of California, San Diego, CA 92093.
John T. Wixted, Department of Psychology, University of California, San Diego, CA 92093.
Larry R. Squire, Departments of Psychiatry, Neurosciences, and Psychology, University of California, San Diego, La Jolla, CA 92093; Veterans Affairs San Diego Healthcare System, San Diego, CA 92161.