This study examined the ability of AD patients and healthy older adults to monitor and judge the likely accuracy of recognition judgments and source judgments about who spoke something earlier. Participants listened to statements at encoding that were presented by a woman and a man, and at a subsequent test phase, participants provided confidence ratings about the likely accuracy of their recognition judgments (i.e., was this statement encountered during the encoding phase or not?) and source judgments (i.e., was this statement presented by the man or the woman?). Consistent with prior findings, we observed that AD patients showed worse recognition performance and source identification performance than did a group of older adults who experienced the identical study and test conditions (e.g., Budson et al., 2006
; Dalla Barba et al., 1999; Multhaup & Balota, 1997
). Moreover, repetition of the study material greatly improved recognition performance on the part of the AD patients but had little effect on source identification performance. This result is consistent with previous suggestions that recollection may be more disrupted than familiarity in AD patients to the extent that our old-new recognition test is influenced by a mixture of familiarity and source recollection whereas our source identification test is influenced primarily by source recollection since both sources were equally familiar (e.g., Budson et al., 2000
; Dalla Barba, 1997; Gallo et al., 2004; Knight, 1998; Souchay & Moulin, 2009; Westerberg et al., 2006
). In addition, we observed that both groups of AD patients were very much impaired at monitoring the accuracy of both recognition and source identification judgments.
There are two striking findings about both groups of AD patients. First, they were worse at monitoring the likely accuracy of recognition and source judgments even when compared to a group of older adults (i.e., the Older-m group) who showed comparable recognition and source identification accuracy. For instance, for both recognition and source judgments, the average confidence rating assigned by the AD-m patients to correct and incorrect judgments was nearly identical, indicating a nearly complete lack of awareness of the likely accuracy of correct and incorrect responses. By contrast, older adults on both tests were much more confident about the likely accuracy of correct than incorrect responses. Because these monitoring differences exist between two groups (the AD-m group and the Older-m group) that show comparable recognition and source identification accuracy, the AD-m patients’ monitoring problems cannot be attributed to a disproportionate difficulty remembering the information. The second striking finding is that even when AD patients were given extra exposure to the study material, which produced a dramatic improvement in recognition performance (i.e., compare the recognition scores of the AD-m and the AD groups), there was little improvement in their ability to use confidence ratings to monitor the likely accuracy of these recognition judgments. For instance, the average confidence rating assigned to Hits and to Misses was nearly identical in both the AD-m and the AD groups.
Finally, for both recognition judgments and source judgments, our results indicate that the monitoring impairment in AD patients is actually worse than their memory impairment (or at least conditions that are able to improve AD patients’ memory are not able to improve monitoring performance), as otherwise there would have been no differences between the healthy older and the AD groups in monitoring performance when there were no differences in accuracy.
However, there is an interesting pattern on the part of the AD patients in that they show a severe monitoring impairment at the level of the item but not so much at the level of the task. As we have reviewed, at the level of the individual response, both groups of AD patients appear nearly completely unaware of the likely accuracy of a particular response. By contrast, at the level of the task, AD patients seem aware that the recognition judgment is an easier task than the source identification judgment. Specifically, both groups of AD patients were more accurate on the recognition task than on the source identification task, and they provided significantly higher confidence ratings for the recognition task than for the source task. Two conclusions can be drawn from this pattern. First, AD patients do not show complete memory-monitoring anosognosia. Second, because confidence tracks accuracy at the level of the task, we can conclude that AD patients are not using the confidence rating scale in a haphazard manner and show some awareness that a higher confidence rating means a higher likelihood of being correct. However, it remains for future research to investigate why AD patients show this memory-monitoring dissociation between responses to particular items and overall responses to one task or another.
Our study suggests that there are two different mechanisms that contribute to the monitoring impairment in AD patients. First, AD patients appear to have a deficit in monitoring episodic memory judgments that contributes to their worse monitoring performance (e.g., worse Somers’ d scores) on both the recognition and the source identification tasks. Second, AD patients appear particularly vulnerable to making high-confidence errors on the source identification task that additionally contributes to their monitoring impairment on this task.
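To make the item-level monitoring measure concrete, the following is a minimal sketch of how a Somers’ d score can be computed from one participant’s responses. It is an illustration under stated assumptions and not the analysis code used in this study: the function name `somers_d`, the treatment of accuracy as the independent variable, and the example ratings on a three-point scale are all hypothetical. Under these assumptions, d is the proportion of correct-incorrect response pairs in which the correct response received the higher confidence rating, minus the proportion in which it received the lower rating.

```python
from itertools import combinations


def somers_d(accuracy, confidence):
    """Somers' d of confidence given accuracy (assumed direction).

    accuracy   -- 1 for a correct response, 0 for an incorrect one
    confidence -- confidence rating given to the same response
    """
    concordant = discordant = untied = 0
    for (a1, c1), (a2, c2) in combinations(zip(accuracy, confidence), 2):
        if a1 == a2:
            continue  # pairs tied on accuracy are excluded from the denominator
        untied += 1
        direction = (a1 - a2) * (c1 - c2)
        if direction > 0:
            concordant += 1  # the correct response got the higher rating
        elif direction < 0:
            discordant += 1  # the incorrect response got the higher rating
        # pairs tied on confidence count in the denominator only
    if untied == 0:
        raise ValueError("need at least one correct and one incorrect response")
    return (concordant - discordant) / untied


# Hypothetical ratings: confidence separates correct from incorrect responses.
good_monitoring = somers_d([1, 1, 1, 0, 0], [3, 3, 2, 1, 1])
# Hypothetical ratings: confidence is unrelated to accuracy.
poor_monitoring = somers_d([1, 1, 0, 0], [2, 3, 2, 3])
print(good_monitoring, poor_monitoring)
```

A value near 1 indicates that confidence discriminates correct from incorrect responses, whereas a value near 0 corresponds to the pattern described above in which correct and incorrect responses receive nearly identical confidence.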
To explain the monitoring behavior of Alzheimer’s patients we propose a Remembrance-Evaluation model of monitoring. This is a two-stage model that builds on the ideas from Johnson and colleagues’ Source-Monitoring Framework, Schacter and colleagues’ Constructive Memory Framework, and Koriat’s Accessibility Model of monitoring (e.g., Johnson, Hashtroudi & Lindsay, 1993
; Schacter, Norman & Koutstaal, 1998; Koriat, 1993
). The first, Remembrance, stage of the model refers to remembered information, and as in the models of Johnson, Schacter, Koriat and their respective colleagues, a confidence rating or a monitoring judgment is based, in part, on the kind and amount of remembered information. For our purposes, remembered information can lead to monitoring failures primarily when the remembered information is illusory and false but it elicits the same kind of subjective experience as true memories that are assigned high confidence responses. These high confidence misrecollections will affect monitoring performance on recollective-based tests, such as cued-recall or source memory tests. We suggest that it is these misrecollections that are causing the occurrence of high-confidence errors by the AD patients on the source judgment.
The second stage of our model is the evaluation process. Once information is remembered, it is necessary to consciously evaluate this information with the purpose of both making a memory judgment (e.g., does the kind and amount of remembered information allow for a judgment of “old”) and making a confidence rating (or an analogous assessment) about the likely accuracy of this memory judgment. Consequently, this evaluation stage can produce monitoring failures on different kinds of memory tests when individuals use inappropriate criteria to evaluate the remembered information, such as inappropriately weighting the presence or absence of particular kinds of memorial information. This inappropriate evaluation will in turn produce an inappropriate judgment (e.g., confidence rating) about the likely accuracy of a response. To simplify the model, we are assuming that a malfunctioning evaluative process will comparably affect the evaluation of different kinds of memorial information (e.g., familiarity, perceptual information, etc.). However, given the studies showing that AD patients appear normal at monitoring the accuracy of responses to general knowledge questions (e.g., Backman & Lipinska, 1993
), a distinction should be made between episodic and semantic information with AD patients showing preserved evaluative processes of semantic information and impaired evaluative processes of episodic information. Additional support for this episodic-semantic distinction comes from Reggev et al. (2011)
who show that different brain regions are associated with monitoring these different kinds of memory. The key point about a malfunctioning evaluation process is that it can contribute to a pattern of impaired monitoring on different kinds of episodic memory tests, which would produce the monitoring impairment by the AD patients on both the recognition and source memory judgments.
Overall, then, the crux of the Remembrance-Evaluation model is that monitoring impairments can be caused by two very different mechanisms: 1) remembered information that is distorted and illusory; and 2) evaluative processes that are inappropriate. The particular signature of the monitoring impairment will depend on which of these mechanisms is malfunctioning, or on whether both are.
Consider the application of the Remembrance-Evaluation model to the monitoring behavior of healthy older adults and AD patients. Growing evidence suggests that, when compared to young adults, healthy older adults show a selective monitoring deficit on source memory tests and cued-recall tests that is caused by high confidence errors, but they show no monitoring deficit when evaluating recognition judgments (e.g., Dodson, Bawa & Krueger, 2007
). The selectivity of the monitoring deficit in older adults suggests that it is caused by a malfunctioning Remembrance mechanism and not by problems with the Evaluation mechanism. In other words, older adults evaluate remembered information normally; their monitoring deficit is caused by a tendency to remember false information. By contrast, AD patients’ monitoring deficit shows a different signature: it is more widespread and occurs on both recognition and source identification judgments. Moreover, because (1) we observed this monitoring deficit on both recognition judgments and source judgments and, (2) given the general assumption that recognition judgments are more familiarity-based than are source judgments, we see no evidence from our study for the notion that AD patients show relatively preserved monitoring of familiarity information and impaired monitoring of recollection information (Souchay, 2007
). Overall, then, we suggest that both stages of the Remembrance-Evaluation model are malfunctioning in AD patients. That is, AD patients are prone to falsely recollect information that leads to high confidence errors and they are impaired at evaluating remembered information in order to provide a confidence rating about its likely accuracy.
Consideration of the brain pathology in AD, and more generally the neural correlates of memory-monitoring, may aid our understanding of why both stages of the Remembrance-Evaluation model are faulty in AD patients. In addition to the medial temporal lobes, AD patients also show pathology in parietal (McKee et al., 2006
) and frontal (Lidstrom et al., 1998
) cortex. Functional neuroimaging studies have documented a relationship between all of these foregoing brain areas and monitoring the accuracy of memory (e.g., Chua, Schacter, Rand-Giovanetti & Sperling, 2006). However, patient studies may provide more direction about the role of particular brain regions in this monitoring process. There is increasing evidence that the parietal lobes may underlie the subjective memorial experience of recollection (e.g., Ally et al., 2008
; Davidson et al., 2008; Simons et al., 2010). Simons et al. (2010) collected confidence ratings from patients with unilateral or bilateral lesions of the parietal lobe and from controls, using a combined recognition and source recollection paradigm that is nearly identical to our task. They observed no differences between either group of parietal patients and controls on either recognition accuracy or the overall average confidence rating associated with the recognition judgment. By contrast, even though there were no differences between the parietal patients and controls on source recollection performance, the patients with bilateral parietal lesions showed significantly lower confidence ratings than controls in their source identification judgments. These results are consistent with their subjective recollection hypothesis that the parietal lobes contribute to the recovery and assessment of details that contribute to recollective experience and that, in turn, contribute to confidence ratings for recollective judgments. However, Simons et al.’s observation of diminished confidence on the source identification task following bilateral parietal damage is the opposite of what we have observed in our AD patients: rather than diminished confidence, our AD patients show excessive high-confidence errors on the nearly identical source recollection task. With respect to our Remembrance-Evaluation model, the Simons et al. study suggests that the bilateral parietal patients show preserved evaluation processes, which account for their preserved monitoring performance on the recognition task, but an impaired and diminished remembrance process that accounts for their reduced confidence ratings on the source task.
While, to our knowledge, there are no studies involving patients with frontal lobe damage that have used a combined recognition/source-identification and confidence task, there is growing evidence from other tasks that suggests the involvement of the frontal lobes in metamemorial judgments. For instance, frontal patients are worse than matched controls at providing accurate feeling of knowing (FOK) judgments about the likelihood of recognizing a not-recalled answer on an episodic cued-recall task (e.g., Janowsky, Shimamura, & Squire, 1989
; Schnyer et al., 2004). Interestingly, the worse monitoring performance by the frontal lobe patients tends to occur because of overconfidence. For instance, Janowsky et al. observed that frontal patients were less likely than controls to correctly recognize unrecalled items when both groups were either moderately or highly confident that they would be able to correctly recognize the item. This pattern of overconfidence is similar to the overconfidence that we have observed in our AD patients. By contrast, patients with medial temporal lobe damage are no different from matched controls in the accuracy of their FOK judgments (see Pannu & Kaszniak, 2005).
Overall, then, there are multiple ways that the frontal lobes could contribute to malfunctioning Remembrance and Evaluation processes. According to Schacter and colleagues’ constructive memory framework, the frontal lobes contribute both to retrieval-cue specification (i.e., focusing) processes and to evaluation processes (Schacter, Norman & Koutstaal, 1998). Poor cue specification can activate memories that are either not appropriate for the target task or are inappropriately vague. In addition, the use of inappropriate evaluation criteria when judging memories has been invoked to explain the pathologically high false recognition rates of frontal patients (e.g., Schacter, Curran et al., 1996). Both of these processes can explain our findings. AD patients may not be specifying the appropriate retrieval cues, which may cause them to misremember past events, thus leading to high-confidence errors on the source task. Moreover, a malfunctioning evaluation process may cause the AD patients to use the confidence rating scale inappropriately on both recognition and source judgment tasks. The AD patients’ frontal lobe pathology likely explains many of our findings.
The malfunctioning of the Evaluation stage of the Remembrance-Evaluation model may also help to explain the abnormally liberal recognition response bias seen in patients with AD. It has been shown previously that AD patients exhibit a more liberal response bias compared with healthy older adults (Budson et al., 2006
), although the patients are able to shift to a more conservative bias when given an instructional manipulation (such as being told that only 30% of test items have been studied) (Waring, Chong, Wolk & Budson, 2008
). Importantly for the present study, Budson et al. (2006)
found that the recognition response bias of AD patients remained more liberal than that of healthy older adults even when their recognition memory performance was equated by varying the length of study and test lists. Although the factors that contribute to recognition response bias presumably involve both conscious evaluation and unconscious processes, the finding that patients with AD showed impaired evaluation of their memory in the present experiment may provide an important clue as to why these patients show an abnormally liberal response bias. We found that both groups of AD patients were comparably confident about the accuracy of both their correct and incorrect recognition responses to studied items, in contrast to the control groups, who each showed higher confidence for their correct than for their incorrect responses. Future studies can work towards examining both memorial confidence and recognition response bias to better understand the relationship between these processes in patients with AD.
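As an illustration of what a “liberal” versus “conservative” recognition response bias means quantitatively, the sketch below computes the signal-detection criterion c from old-new response counts. This is an assumption made for illustration: the text above does not state which bias index the cited studies used, and the function name `criterion_c`, the log-linear correction, and the response counts are hypothetical. Negative values of c indicate a liberal bias (a tendency to respond “old”), and positive values indicate a conservative bias.

```python
from statistics import NormalDist


def criterion_c(hits, misses, false_alarms, correct_rejections):
    """Signal-detection response bias c = -0.5 * (z(H) + z(FA)).

    A log-linear correction (add 0.5 to each count) keeps the rates
    away from 0 and 1, where the z-transform is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return -0.5 * (z(hit_rate) + z(fa_rate))


# Hypothetical counts: many "old" responses to both studied and new items.
liberal = criterion_c(hits=24, misses=6, false_alarms=12, correct_rejections=18)
# Hypothetical counts: few "old" responses to new items.
conservative = criterion_c(hits=18, misses=12, false_alarms=2, correct_rejections=28)
print(liberal, conservative)
```

Pairing an index of this kind with an item-level confidence-accuracy measure would be one way to examine confidence and response bias jointly, as suggested above.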
It is important to note that the monitoring results from our recognition judgment (i.e., worse monitoring performance by the AD patients than by older controls) conflict with the findings of Moulin et al. (2003), who observed that AD patients were no different from healthy older adults in monitoring the accuracy of recognition judgments. However, a fundamental difference between our recognition task and that of Moulin et al. is that they used a two-alternative forced-choice procedure in which participants were presented simultaneously with an old event and a new event. It is likely easier to assess the accuracy of a recognition judgment when one is evaluating a previously encountered event in relation to a novel event that is simultaneously present on the screen. That is, one’s confidence that a recognition judgment is correct can be determined both by the certainty that one event is old and by the certainty that the other event is novel, a quasi-triangulation method for determining the likely accuracy of a recognition judgment. By contrast, fewer cues are available for making this monitoring judgment in our recognition procedure, in which a single event is presented and participants endorse the event as old or new and then assess the likely accuracy of this judgment. Thus, this difference in recognition results between our study and that of Moulin et al. suggests that the evaluation stage in AD patients can be facilitated by providing additional cues or support at retrieval for making a memory judgment, as in Moulin et al.’s study.
There are a couple of potential limitations to our study and points to consider for future research. First, each individual contributes only a small number of items to the overall data (i.e., only 30 items were presented at encoding to each participant). Although the small number of items is a limitation, it should be noted that our central effects were replicated within each AD group and within each older adult group, which testifies to the reliability of these effects. Second, we used different matching procedures in our AD-m group (i.e., extra repetitions of the study material) and in our Older-m group (i.e., adding a large number of filler items at encoding to increase study list length) in order to match accuracy and thus remove differences in accuracy as a confounding explanation for differences in monitoring performance. However, is it possible that these matching manipulations have introduced a confound by changing the cognitive processes associated with the monitoring judgment? In answer to this question, it is critical to point out that the older group that received a matching manipulation showed a pattern of monitoring performance that was nearly identical to the pattern shown by the older adult group that did not receive a matching manipulation. The same observation holds for the AD-m and the AD groups. Because the same pattern of monitoring performance occurs within each group of older adults and within each group of AD patients, it is reasonable to conclude that the difference in monitoring ability between both groups of AD patients and both groups of older adults is caused by some variable related to Alzheimer’s disease and is not caused either 1) by differences in memory strength, since accuracy was equated, or 2) by the matching manipulations.
A point to consider in future research is the effect of our instructions to participants to try to use the entire scale of confidence ratings over the course of the experiment. This instruction is common in confidence-rating studies because it is intended to orient participants to use and interpret the confidence rating scale in a similar manner and thus to minimize biases to either use one end of the scale or the middle points only or the end-points only, etc. Our analyses of the distribution of confidence ratings for each judgment – regardless of accuracy – suggest that there were no differences in the overall use of the scale between our AD patients and the healthy older adults. However, it is a question for future research whether AD patients would show better confidence-accuracy correspondence when they are not given these instructions to “use the entire scale.”
Finally, our neuropsychological tests showed that the individuals in the AD-m group appeared less impaired than those in the AD group, as the AD-m individuals showed significantly better MMSE and TMT-B scores and were numerically better on all of the other neuropsychological tests. We suggest that this group difference has important theoretical implications to the extent that the reduced cognitive impairment reflects the AD-m individuals being, on average, at an earlier stage of the disease. Despite being at an earlier stage of the disease, the AD-m individuals show a devastating monitoring impairment that in many ways is just as severe as the monitoring impairment shown by the AD group. Given (1) that overall objective memory performance is matched between the AD-m and the Older-m groups and (2) the apparent early stage of the disease, we therefore suggest that the episodic memory monitoring impairment that we have documented may manifest itself in AD patients before a corresponding impairment in memory performance (see Galeone et al., 2011 for a related argument).
In conclusion, we observed a dissociation between AD patients’ recognition and source accuracy on the one hand, and their ability to monitor and assess the likely accuracy of their recognition and source judgments on the other. Strikingly, the AD patients’ monitoring impairment persists even when their memory accuracy is boosted to that of a healthy older group, which indicates that the AD patients’ monitoring difficulties are not caused by difficulties remembering the information. There are serious practical consequences to this kind of monitoring impairment. Our finding that AD patients are prone to make high-confidence source errors indicates that they may be particularly vulnerable to disrupted medication regimens in which they either fail to take or repeatedly take medication because they are highly confident in the accuracy of the memory that supports this action. Similarly, patients may be highly confident in their memory that they have already turned the stove off when they have not. In addition, we speculate that the high-confidence source errors that we have observed in our AD patients are similar to the confabulatory responses that these individuals exhibit and that, consequently, there may be a common mechanism underlying both behaviors. We are hopeful that studies such as ours that evaluate memorial confidence can lead to future behavioral and pharmacological interventions that improve patients’ ability to question and probe their own memories prior to acting. Such interventions may be able to improve the lives of patients and their families.